Trends in Risk-Adjusted 28-Day Mortality Rates for Patients Hospitalized with COVID-19 in England
The early phase of the coronavirus disease 2019 (COVID-19) pandemic in the United Kingdom (UK) was characterized by uncertainty as clinicians struggled to understand and manage an unfamiliar disease that affected very high numbers of patients amid radically evolving working environments, with little evidence to support their efforts. Early reports indicated high mortality in patients hospitalized with COVID-19.
As the disease became better understood, treatment evolved, and mortality appears to have decreased. For example, two recent papers, a national study of critical care patients in the UK and a single-site study from New York, have demonstrated a significant reduction in adjusted mortality between the pre- and post-peak periods.1,2 However, the UK study was restricted to patients receiving critical care, potentially introducing bias due to varying critical care admission thresholds over time, while the single-site US study may not be generalizable. Moreover, both studies measured only in-hospital mortality. It therefore remains uncertain whether overall mortality has decreased on a broad scale after accounting for changes in patient characteristics.
The aim of this study was to use a national dataset to assess the trend in risk-adjusted 28-day mortality among patients hospitalized with COVID-19 in England during the first wave of the pandemic.
METHODS
We conducted a retrospective, secondary analysis of admissions to English National Health Service (NHS) hospitals of patients at least 18 years of age between March 1 and July 31, 2020. Data were obtained from the Hospital Episode Statistics (HES) admitted patient care dataset.3 This is an administrative dataset that contains data on diagnoses and procedures as well as organizational characteristics and patient demographics for all NHS activity in England. We included all patients with an International Statistical Classification of Diseases, Tenth Revision (ICD-10) diagnosis of U07.1 (COVID-19, virus identified) or U07.2 (COVID-19, virus not identified).
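As a minimal sketch of this inclusion step (the field names `age` and `diagnosis_codes` are illustrative, not the actual HES schema, and the study's own analysis was carried out in R):

```python
# Hypothetical cohort-selection filter: keep adult admissions carrying a
# qualifying COVID-19 ICD-10 code. Field names are illustrative only.
COVID_CODES = {"U07.1", "U07.2"}

def is_eligible(admission: dict) -> bool:
    """True if the admission is an adult with a U07.1 or U07.2 diagnosis."""
    return (admission["age"] >= 18
            and bool(COVID_CODES & set(admission["diagnosis_codes"])))

admissions = [
    {"age": 67, "diagnosis_codes": ["U07.1", "J12.8"]},  # included
    {"age": 15, "diagnosis_codes": ["U07.1"]},           # excluded: under 18
    {"age": 80, "diagnosis_codes": ["I10"]},             # excluded: no COVID code
]
cohort = [a for a in admissions if is_eligible(a)]
```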
The primary outcome of death within 28 days of admission was obtained by linking to the Civil Registrations (Deaths) - Secondary Care Cut - Information dataset, which includes the date, place, and cause of death from the Office for National Statistics4 and which was complete through September 30, 2020. The time horizon of 28 days from admission was chosen to approximate the Public Health England definition of a death from COVID-19 as being within 28 days of testing positive.5 We restricted our analysis to emergency admissions of persons age ≥18 years. If a patient had multiple emergency admissions, we restricted our analysis to the first admission to ensure comparability across hospitalizations and to best represent outcomes from the earliest onset of COVID-19.
We estimated a modified Poisson regression6 to predict death at 28 days, with month of admission, region, source of admission, age, deprivation, gender, ethnic group, and the 29 comorbidities in the Elixhauser comorbidity measure as variables in the regression.7 The derivation of each of these variables from the HES dataset is shown in Appendix Table 1.
Deprivation was measured by the Index of Multiple Deprivation, a methodology used widely within the UK to classify relative deprivation.8 To control for clustering, hospital system (known as Trust) was added as a random effect. Robust errors were estimated using the sandwich package.9 Modified Poisson regression was chosen in preference to the more common logistic regression because the coefficients can be interpreted as relative risks rather than odds ratios. The model was fitted in R, version 4.0.3, using the geepack library.10 We carried out three sensitivity analyses, restricting to laboratory-confirmed COVID-19, length of stay ≥3 days, and primary respiratory disease, respectively.
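The interpretability point can be illustrated with a toy calculation: under the log link used by modified Poisson regression, an exponentiated coefficient is a relative risk. The baseline figure below is invented for illustration (the study's model itself was fitted in R):

```python
import math

# Toy illustration (made-up baseline risk): with a log link,
# E[death] = exp(b0 + b1 * x), so exp(b1) on a binary covariate is a
# relative risk rather than an odds ratio.
b0 = math.log(0.30)   # hypothetical baseline (March) risk of 30%
b1 = math.log(0.52)   # coefficient for July vs March; exp(b1) = RR of 0.52

risk_march = math.exp(b0)              # 0.30
risk_july = math.exp(b0 + b1)          # 0.30 * 0.52 = 0.156
relative_risk = risk_july / risk_march # recovers 0.52 regardless of baseline
```

With a logit link, the same exponentiation would instead yield an odds ratio, which is why the log link is preferred when relative risks are the quantity of interest.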
For each month, we obtained a standardized mortality ratio (SMR) by fixing the month to the reference month of March 2020 and repredicting the outcome using the existing model. The SMR was the ratio of observed deaths in each month to the number we would have expected (from the model) had those patients been hospitalized in March. We then multiplied each period’s SMR by the March crude mortality to generate monthly adjusted mortality rates. We calculated Poisson confidence intervals around the SMR and used these to obtain confidence intervals for the adjusted rate. The binomial exact method was used to obtain confidence intervals for the crude rate. Multicollinearity was assessed using both the variance inflation factor (VIF) and the condition number test.11 All analyses used two-sided statistical tests, and we considered a P value < .05 to be statistically significant, without adjustment for multiple testing. The study was exempt from UK National Research Ethics Committee approval because it involved secondary analysis of anonymized data.
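The standardization arithmetic can be sketched as follows, with invented counts; the normal-approximation Poisson interval here is a simplification of the exact intervals used in the study, and the function name is hypothetical:

```python
import math

def adjusted_rate(observed, expected, march_crude, z=1.96):
    """SMR-based adjusted mortality for one month (illustrative sketch).

    observed:    deaths observed in the month
    expected:    deaths predicted by the model had the same patients
                 been admitted in March (the reference month)
    march_crude: crude mortality rate in March
    """
    smr = observed / expected
    # Normal-approximation Poisson CI on the observed count, propagated
    # to the SMR; the study used exact Poisson intervals instead.
    half_width = z * math.sqrt(observed) / expected
    rate = march_crude * smr
    ci = (march_crude * (smr - half_width), march_crude * (smr + half_width))
    return rate, ci

# e.g. 120 observed deaths vs 200 expected under March risk; March crude 33.4%
rate, (lo, hi) = adjusted_rate(120, 200, 0.334)
```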
RESULTS
The dataset included 115,643 emergency admissions from 179 healthcare systems, of which 103,202 were first admissions eligible for inclusion. A total of 592 patients (0.5%) were excluded due to missing demographic data, resulting in 102,610 admissions included in the analysis. Peak hospitalizations occurred from late March to mid-April, accounting for 44% of the hospitalizations (Table). Median length of stay for patients who died was 7 days (interquartile range, 3-12). The median age and number of Elixhauser comorbidities decreased in July. The proportion of men decreased between May and July.
The modified Poisson regression had a C statistic of 0.743 (95% CI, 0.740-0.746) (Appendix Table 4). The VIF and condition number test found no evidence of multicollinearity.11
Adjusted mortality decreased each month, from 33.4% in March to 17.4% in July (Figure). The relative risk of death declined progressively to a minimum of 0.52 (95% CI, 0.34-0.80) in July, compared to March.
Admission from another hospital and being female were associated with reduced risk of death. Admission from a skilled nursing facility and being >75 years were associated with increased risk of death. Ten of the 29 Elixhauser comorbidities were associated with increased risk of mortality (cardiac arrhythmia, peripheral vascular disease, other neurologic disorders, renal failure, lymphoma, metastatic cancer, solid tumor without metastasis, coagulopathy, fluid and electrolyte disorders, and anemia). Deprivation and ethnic group were not associated with death among hospitalized patients.
DISCUSSION
Our study of all emergency hospital admissions in England during the first wave of the COVID-19 pandemic demonstrated that, even after adjusting for patient comorbidity and risk factors, the mortality rate decreased by approximately half over the first 5 months. Although the demographics of hospitalized patients changed over that period (with both the median age and the number of comorbidities decreasing), this does not fully explain the decrease in mortality. It is therefore likely that the decrease is due, at least in part, to an improvement in treatment and/or a reduction in hospital strain.
For example, initially the use of corticosteroids was controversial, in part due to previous experience with severe acute respiratory syndrome and Middle East respiratory syndrome (in which a Cochrane review demonstrated no benefit but potential harm). However, this changed as a result of the Randomized Evaluation of COVID-19 Therapy (RECOVERY) trial,12 which showed a significant survival benefit. One of the positive defining characteristics of the COVID-19 pandemic has been the intensive collaborative research effort combined with the rapid dissemination and discussion of new management protocols. The RECOVERY trial randomly assigned >11,000 participants in just 3 months, amounting to approximately 15% of all patients hospitalized with COVID-19 in the UK. Its results were widely publicized via professional networks and rapidly adopted into clinical practice.
Examples of other changes include a higher threshold for mechanical ventilation (and a lower threshold for noninvasive ventilation), increased clinician experience, and, potentially, a reduced viral load arising from increased social distancing and mask wearing. Finally, the hospitals and staff themselves were under enormous physical and mental strain in the early months from multiple factors, including unfamiliar working environments, the large-scale redeployment of inexperienced staff, and very high numbers of patients with an unfamiliar disease. These factors all lessened as the initial peak passed. It is therefore likely that the reduction in adjusted mortality we observed arises from a combination of all these factors, as well as other incremental benefits.
The factors associated with increased mortality risk in our study (increasing age, male gender, certain comorbidities, and frailty [with care home residency acting as a proxy in our study]) are consistent with multiple previous reports. Although not the focus of our analysis, we found no effect of ethnicity or deprivation on mortality. This is consistent with many US studies that demonstrate that the widely reported effect of these factors is likely due to differences in exposure to the disease. Once patients are hospitalized, adjusted mortality risks are similar across ethnic groups and deprivation levels.
The strengths of this study include complete capture of hospitalizations across all hospitals and areas in England. Likewise, linking the hospital data to death data from the Office for National Statistics allows complete capture of outcomes, irrespective of where the patient died. This is a significant strength compared to prior studies, which only included in-hospital mortality. Our results are therefore likely robust and a true observation of the mortality trend.
Limitations include the lack of physiologic and laboratory data; having these would have allowed us to adjust for disease severity on admission and strengthened the risk stratification. Likewise, although the complete national coverage is overall a significant strength, aggregating data from numerous areas that might be at different stages of local outbreaks, have different management strategies, and have differing data quality introduces its own biases.
Furthermore, these results predate the second wave in the UK, so we cannot distinguish whether the reduced mortality is due to improved treatment, a seasonal effect, evolution of the virus itself, or a reduction in the strain on hospitals.
CONCLUSION
This nationwide study indicates that, even after accounting for changing patient characteristics, the mortality of patients hospitalized with COVID-19 in England decreased significantly as the outbreak progressed. This is likely due to a combination of incremental improvements in treatment and a reduction in strain on hospitals.
1. Horwitz LI, Jones SA, Cerfolio RJ, et al. Trends in COVID-19 risk-adjusted mortality rates. J Hosp Med. 2020;16(2):90-92. https://doi.org/10.12788/jhm.3552
2. Dennis JM, McGovern AP, Vollmer SJ, Mateen BA. Improving survival of critical care patients with coronavirus disease 2019 in England: a national cohort study, March to June 2020. Crit Care Med. 2021;49(2):209-214. https://doi.org/10.1097/CCM.0000000000004747
3. NHS Digital. Hospital Episode Statistics Data Dictionary. Published March 2018. Accessed October 15, 2020. https://digital.nhs.uk/data-and-information/data-tools-and-services/data-services/hospital-episode-statistics/hospital-episode-statistics-data-dictionary
4. NHS Digital. HES-ONS Linked Mortality Data Dictionary. Accessed October 15, 2020. https://digital.nhs.uk/binaries/content/assets/legacy/word/i/p/hes-ons_linked_mortality_data_dictionary_-_mar_20181.docx
5. Public Health England. Technical summary: Public Health England data series on deaths in people with COVID-19. Accessed November 11, 2020. https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/916035/RA_Technical_Summary_-_PHE_Data_Series_COVID_19_Deaths_20200812.pdf
6. Zou G. A modified Poisson regression approach to prospective studies with binary data. Am J Epidemiol. 2004;159(7):702-706. https://doi.org/10.1093/aje/kwh090
7. van Walraven C, Austin PC, Jennings A, et al. A modification of the Elixhauser comorbidity measures into a point system for hospital death using administrative data. Med Care. 2009;47(6):626-633. https://doi.org/10.1097/MLR.0b013e31819432e5
8. Ministry of Housing, Communities & Local Government. The English Indices of Deprivation 2019 (IoD2019). Published September 26, 2020. Accessed January 15, 2021. https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/835115/IoD2019_Statistical_Release.pdf
9. Zeileis A. Object-oriented computation of sandwich estimators. J Stat Software. 2006;16:1-16. https://doi.org/10.18637/jss.v016.i09
10. Højsgaard S, Halekoh U, Yan J. The R package geepack for generalized estimating equations. J Stat Software. 2006;15:1-11. https://doi.org/10.18637/jss.v015.i02
11. Belsley DA, Kuh E, Welsch RE. Diagnostics: Identifying Influential Data and Sources of Collinearity. John Wiley & Sons; 1980.
12. RECOVERY Collaborative Group, Horby P, Lim WS, Emberson JR, et al. Dexamethasone in hospitalized patients with covid-19 - preliminary report. N Engl J Med. 2020:NEJMoa2021436. https://doi.org/10.1056/NEJMoa2021436
Furthermore, these results predate the second wave in the UK, so we cannot distinguish whether the reduced mortality is due to improved treatment, a seasonal effect, evolution of the virus itself, or a reduction in the strain on hospitals.
CONCLUSION
This nationwide study indicates that, even after accounting for changing patient characteristics, the mortality of patients hospitalized with COVID-19 in England decreased significantly as the outbreak progressed. This is likely due to a combination of incremental improvements in treatment and reductions in hospital strain.
1. Horwitz LI, Jones SA, Cerfolio RJ, et al. Trends in COVID-19 risk-adjusted mortality rates. J Hosp Med. 2020;16(2):90-92. https://doi.org/10.12788/jhm.3552
2. Dennis JM, McGovern AP, Vollmer SJ, Mateen BA. Improving survival of critical care patients with coronavirus disease 2019 in England: a national cohort study, March to June 2020. Crit Care Med. 2021;49(2):209-214. https://doi.org/10.1097/CCM.0000000000004747
3. NHS Digital. Hospital Episode Statistics Data Dictionary. Published March 2018. Accessed October 15, 2020. https://digital.nhs.uk/data-and-information/data-tools-and-services/data-services/hospital-episode-statistics/hospital-episode-statistics-data-dictionary
4. NHS Digital. HES-ONS Linked Mortality Data Dictionary. Accessed October 15, 2020. https://digital.nhs.uk/binaries/content/assets/legacy/word/i/p/hes-ons_linked_mortality_data_dictionary_-_mar_20181.docx
5. Public Health England. Technical summary: Public Health England data series on deaths in people with COVID-19. Accessed November 11, 2020. https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/916035/RA_Technical_Summary_-_PHE_Data_Series_COVID_19_Deaths_20200812.pdf
6. Zou G. A modified Poisson regression approach to prospective studies with binary data. Am J Epidemiol. 2004;159(7):702-706. https://doi.org/10.1093/aje/kwh090
7. van Walraven C, Austin PC, Jennings A, et al. A modification of the Elixhauser comorbidity measures into a point system for hospital death using administrative data. Med Care. 2009;47(6):626-633. https://doi.org/10.1097/MLR.0b013e31819432e5
8. Ministry of Housing Communities & Local Government. The English Indices of Deprivation 2019 (IoD2019). Published September 26, 2020. Accessed January 15, 2021. https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/835115/IoD2019_Statistical_Release.pdf
9. Zeileis A. Object-oriented computation of sandwich estimators. J Stat Software. 2006;16:1-16. https://doi.org/10.18637/jss.v016.i09
10. Højsgaard S, Halekoh U, Yan J. The R package geepack for generalized estimating equations. J Stat Software. 2006;15:1-11. https://doi.org/10.18637/jss.v015.i02
11. Belsley DA, Kuh E, Welsch RE. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. John Wiley & Sons; 1980.
12. RECOVERY Collaborative Group, Horby P, Lim WS, Emberson JR, et al. Dexamethasone in hospitalized patients with covid-19 - preliminary report. N Engl J Med. 2020:NEJMoa2021436. https://doi.org/10.1056/NEJMoa2021436
© 2021 Society of Hospital Medicine
New Author Guidelines for Addressing Race and Racism in the Journal of Hospital Medicine
We are committed to using our platform at the Journal of Hospital Medicine (JHM) to address inequities in healthcare delivery, policy, and research. Race was conceived as a mechanism of social division, leading to the false belief, propagated over time, of race as a biological variable.1 As a result, racism has contributed to the medical abuse and exploitation of Black and Brown communities and inequities in health status among racialized groups. We must abandon practices that perpetuate inequities and champion practices that resolve them. Racial health equity—the absence of unjust and avoidable health disparities among racialized groups—is unattainable if we continue to simply identify inequities without naming racism as a determinant of health. As a journal, our responsibility is to disseminate evidence-based manuscripts that reflect an understanding of race, racism, and health.
We have modified our author guidelines. First, we now require authors to clearly define race and provide justification for its inclusion in clinical case descriptions and study analyses. We aim to contribute to the necessary course correction as well as promote self-reflection on study design choices that propagate false notions of race as a biological concept and conclusions that reinforce race-based rather than race-conscious practices in medicine.2 Second, we expect authors to explicitly name racism and make a concerted effort to explore its role, identify its specific forms, and examine mutually reinforcing mechanisms of inequity that potentially contributed to study findings. Finally, we instruct authors to avoid the use of phrases like “patient mistrust,” which places blame for inequities on patients and their families and decouples mistrust from the fraught history of racism in medicine.
We must also acknowledge and reflect on our previous contributions to such inequity as authors, reviewers, and editors in order to learn and grow. Among the more than 2,000 articles published in JHM since its inception, only four included the term “racism.” Three of these articles are perspectives published in June 2020 and beyond. The only original research manuscript that directly addressed racism was a qualitative study of adults with sickle cell disease.3 The authors described study participants’ perspectives: “In contrast, the hospital experience during adulthood was often punctuated by bitter relationships with staff, and distrust over possible excessive use of opioids. Moreover, participants raised the possibility of racism in their interactions with hospital staff.” In this example, patients called out racism and its impact on their experience. We know JHM is not alone in falling woefully short in advancing our understanding of racism and racial health inequities. Each of us should identify missed opportunities to call out racism as a driver of racial health disparities in our own publications. We must act on these lessons regarding the ways in which racism infiltrates scientific publishing. We must use this awareness, along with our influence, voice, and collective power, to enact change for the betterment of our patients, their families, and the medical community.
We at JHM will contribute to uncovering and disseminating solutions to health inequities that result from racism. We are grateful to Boyd et al for their call to action and for providing a blueprint for improvement to those of us who write, review, and publish scholarly work.4
1. Roberts D. Fatal Invention: How Science, Politics, and Big Business Re-Create Race in the Twenty-First Century. 2nd ed. The New Press; 2012.
2. Cerdeña JP, Plaisime MV, Tsai J. From race-based to race-conscious medicine: how anti-racist uprisings call us to act. Lancet. 2020;396:1125-1128. https://doi.org/10.1016/S0140-6736(20)32076-6
3. Weisberg D, Balf-Soran G, Becker W, et al. “I’m talking about pain”: sickle cell disease patients with extremely high hospital use. J Hosp Med. 2013;8:42-46. https://doi.org/10.1002/jhm.1987
4. Boyd RW, Lindo EG, Weeks LD, McLemore MR. On racism: a new standard for publishing on racial health inequities. Health Affairs Blog. July 2, 2020. Accessed January 22, 2021. https://doi.org/10.1377/hblog20200630.939347
A Preoperative Transthoracic Echocardiography Protocol to Reduce Time to Hip Fracture Surgery
From Dignity Health Methodist Hospital of Sacramento Family Medicine Residency Program, Sacramento, CA (Dr. Oldach); Nationwide Children’s Hospital, Columbus, OH (Dr. Irwin); OhioHealth Research Institute, Columbus, OH (Dr. Pershing); Department of Clinical Transformation, OhioHealth, Columbus, OH (Dr. Zigmont and Dr. Gascon); and Department of Geriatrics, OhioHealth, Columbus, OH (Dr. Skully).
Abstract
Objective: An interdisciplinary committee was formed to identify factors contributing to surgical delays in urgent hip fracture repair at an urban, level 1 trauma center, with the goal of reducing preoperative time to less than 24 hours. Surgical optimization was identified as a primary, modifiable factor, as surgeons were reluctant to clear patients for surgery without cardiac consultation. Preoperative transthoracic echocardiogram (TTE) was recommended as a safe alternative to cardiac consultation in most patients.
Methods: A retrospective review was conducted for patients who underwent urgent hip fracture repair between January 2010 and April 2014 (n = 316). Time to medical optimization, time to surgery, hospital length of stay, and anesthesia induction were compared for 3 patient groups of interest: those who received (1) neither TTE nor cardiology consultation (ie, direct to surgery); (2) a preoperative TTE; or (3) preoperative cardiac consultation.
Results: There were significant between-group differences in medical optimization time (P = 0.001) and mean time to surgery (P < 0.001) when comparing the 3 groups of interest. Patients in the preoperative cardiac consult group had the longest times, followed by the TTE and direct-to-surgery groups. There were no differences in the type of induction agent used across treatment groups when stratifying by ejection fraction.
Conclusion: Preoperative TTE allows for decreased preoperative time compared to a cardiology consultation. It provides an easily implemented inter-departmental, intra-institutional intervention to decrease preoperative time in patients presenting with hip fractures.
Keywords: surgical delay; preoperative risk stratification; process improvement.
Hip fractures are common, expensive, and associated with poor outcomes.1,2 Ample literature suggests that morbidity, mortality, and cost of care may be reduced by minimizing surgical delays.3-5 While individual reports indicate mixed evidence, in a 2010 meta-analysis, surgery within 72 hours was associated with significant reductions in pneumonia and pressure sores, as well as a 19% reduction in all-cause mortality through 1 year.6 Additional reviews suggest evidence of improved patient outcomes (pain, length of stay, non-union, and/or mortality) when surgery occurs early, within 12 to 72 hours after injury.4,6,7 Regardless of the definition of “early surgery” used, surgical delay remains a challenge, often due to organizational factors, including admission day of the week and hospital staffing, and patient characteristics, such as comorbidities, echocardiographic findings, age, and insurance status.7-9
Among factors that contribute to surgical delays, the need for preoperative cardiovascular risk stratification is significantly modifiable.10 The American College of Cardiology (ACC)/American Heart Association (AHA) Task Force risk stratification framework for preoperative cardiac testing assists clinicians in determining surgical urgency, active cardiac conditions, cardiovascular risk factors, and functional capacity of each patient, and is well established for low- or intermediate-risk patients.11 Specifically, metabolic equivalents (METs) measurements are used to identify medically stable patients with good or excellent functional capacity versus poor or unknown functional status. Patients with ≥ 4 METs may proceed to surgery without further testing; patients with < 4 METs may either proceed with planned surgery or undergo additional testing. For patients with a perceived increased risk profile who require urgent or semi-urgent hip fracture repair, however, care may be delayed by disagreement about which preoperative cardiac testing is required.
At OhioHealth Grant Medical Center (GMC), an urban, level 1 trauma center, the consideration of further preoperative noninvasive testing frequently contributed to surgical delays. In 2009, hip fracture patients arriving to the emergency department (ED) waited an average of 51 hours before being transferred to the operating room (OR) for surgery. Presuming prompt surgery is both desirable and feasible, the Grant Hip Fracture Management Committee (GHFMC) was developed in order to expedite surgeries in hip fracture patients. The GHFMC recommended a preoperative hip fracture protocol, and the outcomes from protocol implementation are described in this article.
Methods
This study was approved by the OhioHealth Institutional Review Board, with a waiver of the informed consent requirement. Medical records from patients treated at GMC during the time period between January 2010 and April 2014 (ie, following implementation of GHFMC recommendations) were retrospectively reviewed to identify the extent to which the use of preoperative transthoracic echocardiography (TTE) reduced average time to surgery and total length of stay, compared to cardiac consultation. This chart review included 316 participants and was used to identify primary induction agent utilized, time to medical optimization, time to surgery, and total length of hospital stay.
Intervention
The GHFMC conducted a 9-month quality improvement project to decrease ED-to-OR time to less than 24 hours for hip fracture patients. The multidisciplinary committee consisted of physicians from orthopedic surgery, anesthesia, hospital medicine, and geriatrics, along with key administrators and nurse outcomes managers. While the optimal surgical timing is not fully established, the committee decided that surgery within 24 hours would benefit the majority of patients and was therefore a prudent goal.
Based on identified barriers that contributed to surgical delays, several process improvement strategies were implemented, including admitting patients to the hospitalist service, engaging the orthopedic trauma team, and implementing pre- and postoperative protocols and order sets (eg, ED and pain management order sets). Specific emphasis was placed on establishing guidelines for determining medical optimization. In the absence of established guidelines, medical optimization was determined at the discretion of the attending physician. The necessity of preoperative cardiac assessment was based, in part, on physician concerns about determining safe anesthesia protocols and hemodynamically managing patients who may have occult heart disease, specifically those patients with low functional capacity (< 4 METs) and/or inability to accurately communicate their medical history.
Many hip fractures result from a fall, and it may be unclear whether the fall causing a fracture was purely mechanical or indicative of a distinct acute or chronic illness. As a result, many patients received cardiac consultations, with or without pharmacologic stress testing, adding another 24 to 36 hours to preoperative time. As invasive preoperative cardiac procedures generally result in surgical delays without improving outcomes,11 the committee recommended that clinicians reserve preoperative cardiac consultation for patients with active cardiac conditions.
In lieu of cardiac consultation, the committee suggested preoperative TTE. While use of TTE has not been shown to improve preoperative risk stratification in routine noncardiac surgeries, it has been shown to provide clinically useful information in patients at high risk for cardiac complications.11 There was consensus for incorporating preoperative TTE for several reasons: (1) the patients with hip fractures were not “routine,” and often did not have a reliable medical history; (2) a large percentage of patients had cardiac risk factors; (3) patients with undiagnosed aortic stenosis, severe left ventricular dysfunction, or severe pulmonary hypertension would likely have altered intraoperative fluid management; and (4) in supplanting cardiac consultations, TTE would likely expedite patients’ ED-to-OR times. Therefore, the GHFMC recommended ordering urgent TTE for patients who were unable to exercise at ≥ 4 METs but needed urgent hip fracture surgery.
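The triage logic described above can be sketched as a simple decision rule. This is an illustrative reading of the GHFMC pathway, not the committee's actual protocol document: the function name, the `active_cardiac_condition` flag, and the treatment of unknown functional capacity as "< 4 METs" are assumptions made for the sketch.

```python
def preoperative_pathway(mets, active_cardiac_condition=False):
    """Illustrative sketch of the GHFMC preoperative triage rule.

    mets: measured functional capacity in metabolic equivalents,
    or None if unknown (e.g., patient cannot communicate history).
    """
    # Active cardiac conditions still warrant cardiology consultation.
    if active_cardiac_condition:
        return "cardiology consult"
    # Good functional capacity (>= 4 METs): proceed without further testing.
    if mets is not None and mets >= 4:
        return "direct to surgery"
    # Poor (< 4 METs) or unknown capacity: urgent TTE in lieu of consult.
    return "urgent TTE"
```

For example, a patient who can climb stairs (roughly 4 to 5 METs) would proceed directly, while a patient with unknown capacity would be routed to urgent TTE.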
In order to evaluate the success of the new protocol, the ED-to-OR times were calculated for a cohort of patients who underwent surgery for hip fracture following algorithm implementation.
Participants
A chart review was conducted for patients admitted to GMC between January 2010 and April 2014 for operative treatment of a hip fracture. Exclusion criteria included lack of radiologist-diagnosed hip fracture, periprosthetic hip fracture, or multiple traumas. Electronic patient charts were reviewed by investigators (KI and BO) using a standardized, electronic abstraction form for 3 groups of patients who (1) proceeded directly to planned surgery without TTE or cardiac consultation (direct-to-surgery group); (2) received preoperative TTE but not a cardiac consultation (TTE-only group); or (3) received preoperative cardiac consultation (cardiac consult group).
Measures
Demographics, comorbid conditions, MET score, anesthesia protocol, and in-hospital morbidity and mortality were extracted from medical charts. Medical optimization time was determined by the latest time stamp of 1 of the following: time that the final consulting specialist stated that the patient was stable for surgery; time that the hospitalist described the patient as being ready for surgery; time that the TTE report was certified by the reading cardiologist; or time that the hospitalist described the outcome of completed preoperative risk stratification. Time elapsed prior to medical optimization, surgery, and discharge were calculated using differences between the patient’s arrival date and time at the ED, first recorded time of medical optimization, surgical start time (from the surgical report), and discharge time, respectively.
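The "latest time stamp" rule above can be made concrete with a small sketch. The function below is illustrative only (names are not from the study's abstraction form); it takes whichever of the four chart events were present and returns elapsed hours from ED arrival.

```python
from datetime import datetime

def medical_optimization_hours(ed_arrival, event_timestamps):
    """Hours from ED arrival to medical optimization.

    event_timestamps collects whichever of the chart events are
    present (specialist clearance, hospitalist readiness note, TTE
    certification, completed risk stratification); optimization time
    is defined as the latest of these.
    """
    latest = max(event_timestamps)
    return (latest - ed_arrival).total_seconds() / 3600.0

# Illustrative chart: arrival 8:00 AM, two candidate events.
arrival = datetime(2013, 1, 1, 8, 0)
events = [datetime(2013, 1, 1, 20, 0),   # hospitalist readiness note
          datetime(2013, 1, 2, 2, 0)]    # TTE report certified
hours = medical_optimization_hours(arrival, events)  # 18.0 hours
```

Time to surgery and length of stay follow the same pattern, substituting the surgical start time and discharge time, respectively.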
To assess whether the TTE protocol may have affected anesthesia selection, the induction agent (etomidate or propofol) was abstracted from anesthesia reports and stratified by the ejection fraction of each patient: very low (≤ 35%), low (36%–50%), or normal (> 50%). Patients without an echocardiogram report were assumed to have a normal ejection fraction for this analysis.
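The ejection fraction strata above map directly to a small categorization rule; this sketch simply restates the cutoffs from the text, including the assumption that patients without an echocardiogram report are treated as normal.

```python
def ef_category(ejection_fraction=None):
    """Map ejection fraction (%) to the strata used in the analysis.

    None (no echocardiogram report) is assumed normal, per the text.
    """
    if ejection_fraction is None or ejection_fraction > 50:
        return "normal"      # > 50%
    if ejection_fraction <= 35:
        return "very low"    # <= 35%
    return "low"             # 36%-50%
```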
Analysis
Descriptive statistics were produced using mean and standard deviation (SD) for continuous variables and frequency and percentage for categorical variables. To determine whether statistically significant differences existed between the 3 groups, the Kruskal-Wallis test was used to compare skewed continuous variables, and Pearson’s chi-square test was used to compare categorical variables. Due to differences in baseline patient characteristics across the 3 treatment groups, inverse probability weights were used to adjust for group differences (using a multinomial logit treatment model) while comparing differences in outcome variables. This modeling strategy does not rely on any assumptions for the distribution of the outcome variable. Covariates were considered for inclusion in the treatment or outcome model if they were significantly associated (P < 0.05) with the group variable. Additionally, anesthetic agent (etomidate or propofol) was compared across the treatment groups after stratifying by ejection fraction to identify whether any differences existed in anesthesia regimen. Patients who were prescribed more than 1 anesthetic agent (n = 2) or an agent that was not of interest were removed from the analysis (n = 13). Stata (version 14) was used for analysis. All other missing data with respect to the tested variables were omitted in the analysis for that variable. Any disagreements about abstraction were resolved through consensus between the investigators.
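The inverse-probability-weighting step can be sketched as follows. This is a simplified illustration, not the study's Stata implementation: propensities for each patient's observed group are taken as given (in the study they come from a multinomial logit model of group membership on covariates such as age and comorbidity count), and all numbers are invented for the example.

```python
def ipw_group_means(records):
    """Weighted mean outcome per treatment group.

    Each patient is weighted by the inverse of the estimated
    probability of the group they actually received, so groups are
    reweighted toward a common covariate distribution.
    records: iterable of (group, outcome, propensity) tuples.
    """
    totals, weights = {}, {}
    for group, outcome, propensity in records:
        w = 1.0 / propensity
        totals[group] = totals.get(group, 0.0) + w * outcome
        weights[group] = weights.get(group, 0.0) + w
    return {g: totals[g] / weights[g] for g in totals}

# Illustrative records: (group, ED-to-OR hours, estimated propensity).
example = [
    ("direct", 18.0, 0.50), ("direct", 24.0, 0.25),
    ("tte", 26.0, 0.40), ("tte", 30.0, 0.20),
    ("consult", 34.0, 0.30), ("consult", 40.0, 0.15),
]
means = ipw_group_means(example)
```

Patients with low propensity for their observed group receive larger weights, which is how the adjustment compensates for the baseline differences (e.g., the younger age of the direct-to-surgery group) noted in the Results.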
Results
A total of 316 cases met inclusion criteria, including 108 direct-to-surgery patients, 143 preoperative TTE patients, and 65 cardiac consult patients. Patient demographics and preoperative characteristics are shown in Table 1. The average age for all patients was 76.5 years of age (SD, 12.89; IQR, 34-97); however, direct-to-surgery patients were significantly (P < 0.001) younger (71.2 years; SD, 14.2; interquartile range [IQR], 34-95 years) than TTE-only patients (79.0 years; SD, 11.5; IQR, 35-97 years) and cardiac consult patients (79.57 years; SD, 10.63; IQR, 49-97 years). The majority of patients were female (69.9%) and experienced a fall prior to admission (94%). Almost three-fourths of patients had 1 or more cardiac risk factors (73.7%), including history of congestive heart failure (CHF; 19%), coronary artery disease (CAD; 26.3%), chronic obstructive pulmonary disease (COPD; 19.3%), or aortic stenosis (AS; 3.5%). Due to between-group differences in these comorbid conditions, confounding factors were adjusted for in subsequent analyses.
As shown in Table 2, before adjustment for confounding factors, there were significant between-group differences in medical optimization time for patients in all 3 groups. After adjustment for treatment differences using age and number of comorbid diseases, and medical optimization time differences using age and COPD, fewer between-group differences were statistically significant. Patients who received a cardiac consult had an 18.44-hour longer medical optimization time compared to patients who went directly to surgery (29.136 vs 10.696 hours; P = 0.001). Optimization remained approximately 5 hours longer for the TTE-only group than for the direct-to-surgery group; however, this difference was not significant (P = 0.075).
When comparing differences in ED-to-OR time for the 3 groups after adjusting the probability of treatment for age and the number of comorbid conditions, and adjusting the probability of ED-to-OR time for age, COPD, and CHF, significant differences remained in ED-to-OR times across all groups. Specifically, patients in the direct-to-surgery group experienced the shortest time (mean, 20.64 hours), compared to patients in the TTE-only group (mean, 26.32; P = 0.04) or patients in the cardiac consult group (mean, 36.08; P < 0.001). TTE-only patients had a longer time of 5.68 hours, compared to the direct-to-surgery group, and patients in the preoperative cardiac consult group were on average 15.44 hours longer than the direct-to-surgery group.
When comparing differences in the length of stay for the 3 groups before statistical adjustments, differences were observed; however, after removing the confounding factors related to treatment (age and CAD) and the outcome (age and the number of comorbid conditions), there were no statistically significant differences in the length of stay for the 3 groups. Average length of stay was 131 hours for direct-to-surgery patients, 142 hours for TTE-only patients, and 141 hours for cardiac consult patients.
The use of different anesthetic agents was compared for patients in the 3 groups. The majority of patients in the study (87.7%) were given propofol, and there were no differences after stratifying by ejection fraction (Table 3).
Discussion
The GHFMC was created to reduce surgical delays for hip fracture. Medical optimization was considered a primary, modifiable factor given that surgeons were reluctant to proceed without a cardiac consult. To address this gap, the committee recommended a preoperative TTE for patients with low or unknown functional status. This threshold provides a quick and easy method for stratifying patients who previously required risk stratification by a cardiologist, which often resulted in surgery delays.
In their recommendations for implementation of hip fracture quality improvement projects, the Geriatric Fracture Center emphasizes the importance of multidisciplinary physician leadership along with standardization of approach across patients.12 This recommendation is supported by increasing evidence that orthogeriatric collaborations are associated with decreased mortality and length of stay.13 The GHFMC and subsequent interventions reflect this approach, allowing for collaboration to identify cross-disciplinary procedural barriers to care. In our institution, addressing identified procedural barriers to care was associated with a reduction in the average time to surgery from 51 hours to 25.3 hours.
Multiple approaches have been attempted to decrease presurgical time in hip fracture patients in various settings. Prehospital interventions, such as providing ambulances with checklists and ability to bypass the ED, have not been shown to decrease time to surgery for hip fracture patients, though similar strategies have been successful in other conditions, such as stroke.14,15 In-hospital procedures, such as implementation of a hip fracture protocol and reduction of preoperative interventions, have more consistently been found to decrease time to surgery and in-hospital mortality.16,17 However, reduced delays have not been found universally. Luttrell and Nana found that preoperative TTE resulted in approximately 30.8-hour delays from the ED to OR, compared to patients who did not receive a preoperative TTE.18 However, in that study hospitalists used TTE at their own discretion, and there may have been confounding factors contributing to delays. When used as part of a protocol targeting patients with poor or unknown functional capacity, we believe that preoperative TTE results in modest surgical delays yet provides clinically useful information about each patient.
ACC/AHA preoperative guidelines were updated after we implemented our intervention and now recommend that patients with poor or unknown functional capacity in whom stress testing will not influence care proceed to surgery “according to guideline-directed medical care.”11 While routine use of preoperative evaluation of left ventricular function is not recommended, assessing left ventricular function may be reasonable for patients with heart failure with a change in clinical status. Guidelines also recommend that patients with clinically suspected valvular stenosis undergo preoperative echocardiography.11
Limitations
This study has several limitations. First, due to resource limitations, a substantial period of time elapsed between implementation of the new protocol and the analysis of the data set. That is, the hip fracture protocol assessed in this paper occurred from January 2010 through April 2014, and final analysis of the data set occurred in April 2020. This limitation precludes our ability to formally assess any pre- or post-protocol changes in patient outcomes. Second, randomization was not used to create groups that were balanced in differing health characteristics (ie, patients with noncardiac-related surgeries, patients in different age groups); however, the use of inverse probability treatment regression analysis was a way to statistically address these between-group differences. Moreover, this study is limited by the factors that were measured; unmeasured factors cannot be accounted for. Third, health care providers working at the hospital during this time were aware of the goal to decrease presurgical time, possibly creating exaggerated effects compared to a blinded trial. Finally, although this intervention is likely translatable to other centers, these results represent the experiences of a single level 1 trauma center and may not be replicable elsewhere.
Conclusion
Preoperative TTE in lieu of cardiac consultation has several advantages. First, it requires interdepartmental collaboration for implementation, but can be implemented through a single hospital or hospital system. Unlike prehospital interventions, preoperative urgent TTE for patients with low functional capacity does not require the support of emergency medical technicians, ambulance services, or other hospitals in the region. Second, while costs are associated with TTE, they are offset by a reduction in expensive consultations with specialists, surgical delays, and longer lengths of stay. Third, despite likely increased ED-to-OR times compared to no intervention, urgent TTE decreases time to surgery compared with cardiology consultation. Prior to the GHFMC, the ED-to-OR time at our institution was 51 hours. In contrast, the mean time following the GHFMC-led protocol was less than half that, at 25.3 hours (SD, 19.1 hours). In fact, nearly two-thirds (65.2%) of the patients evaluated in this study underwent surgery within 24 hours of admission. This improvement in presurgical time was attributed, in part, to the implementation of preoperative TTE over cardiology consultations.
Acknowledgments: The authors thank Jenny Williams, RN, who was instrumental in obtaining the data set for analysis, and Shauna Ayres, MPH, from the OhioHealth Research Institute, who provided writing and technical assistance.
Corresponding author: Robert Skully, MD, OhioHealth Family Medicine Grant, 290 East Town St., Columbus, OH 43215; [email protected].
Funding: This work was supported by the OhioHealth Summer Research Externship Program.
Financial disclosures: None.
1. Brauer CA, Coca-Perraillon M, Cutler DM, Rosen AB. Incidence and mortality of hip fractures in the United States. JAMA. 2009;302:1573-1579.
2. Lewiecki EM, Wright NC, Curtis JR, et al. Hip fracture trends in the United States 2002 to 2015. Osteoporos Int. 2018;29:717-722.
3. Colais P, Di Martino M, Fusco D, et al. The effect of early surgery after hip fracture on 1-year mortality. BMC Geriatr. 2015;15:141.
4. Nyholm AM, Gromov K, Palm H, et al. Time to surgery is associated with thirty-day and ninety-day mortality after proximal femoral fracture: a retrospective observational study on prospectively collected data from the Danish Fracture Database Collaborators. J Bone Joint Surg Am. 2015;97:1333-1339.
5. Judd KT, Christianson E. Expedited operative care of hip fractures results in significantly lower cost of treatment. Iowa Orthop J. 2015;35:62-64.
6. Simunovic N, Devereaux PJ, Sprague S, et al. Effect of early surgery after hip fracture on mortality and complications: systematic review and meta-analysis. CMAJ. 2010;182:1609-1616.
7. Ryan DJ, Yoshihara H, Yoneoka D, et al. Delay in hip fracture surgery: an analysis of patient-specific and hospital-specific risk factors. J Orthop Trauma. 2015;29:343-348.
8. Ricci WM, Brandt A, McAndrew C, Gardner MJ. Factors affecting delay to surgery and length of stay for patients with hip fracture. J Orthop Trauma. 2015;29:e109-e114.
9. Hagino T, Ochiai S, Senga S, et al. Efficacy of early surgery and causes of surgical delay in patients with hip fracture. J Orthop. 2015;12:142-146.
10. Rafiq A, Sklyar E, Bella JN. Cardiac evaluation and monitoring of patients undergoing noncardiac surgery. Health Serv Insights. 2017;9:1178632916686074.
11. Fleisher LA, Fleischmann KE, Auerbach AD, et al. 2014 ACC/AHA guideline on perioperative cardiovascular evaluation and management of patients undergoing noncardiac surgery: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines. J Am Coll Cardiol. 2014;64:e77-e137.
12. Basu N, Natour M, Mounasamy V, Kates SL. Geriatric hip fracture management: keys to providing a successful program. Eur J Trauma Emerg Surg. 2016;42:565-569.
13. Grigoryan KV, Javedan H, Rudolph JL. Orthogeriatric care models and outcomes in hip fracture patients: a systematic review and meta-analysis. J Orthop Trauma. 2014;28:e49-e55.
14. Tai YJ, Yan B. Minimising time to treatment: targeted strategies to minimise time to thrombolysis for acute ischaemic stroke. Intern Med J. 2013;43:1176-1182.
15. Larsson G, Stromberg RU, Rogmark C, Nilsdotter A. Prehospital fast track care for patients with hip fracture: impact on time to surgery, hospital stay, post-operative complications and mortality: a randomised, controlled trial. Injury. 2016;47:881-886.
16. Bohm E, Loucks L, Wittmeier K, et al. Reduced time to surgery improves mortality and length of stay following hip fracture: results from an intervention study in a Canadian health authority. Can J Surg. 2015;58:257-263.
17. Ventura C, Trombetti S, Pioli G, et al. Impact of multidisciplinary hip fracture program on timing of surgery in elderly patients. Osteoporos Int. 2014;25:2591-2597.
18. Luttrell K, Nana A. Effect of preoperative transthoracic echocardiogram on mortality and surgical timing in elderly adults with hip fracture. J Am Geriatr Soc. 2015;63:2505-2509.
Abstract
Objective: An interdisciplinary committee was formed to identify factors contributing to surgical delays in urgent hip fracture repair at an urban, level 1 trauma center, with the goal of reducing preoperative time to less than 24 hours. Surgical optimization was identified as a primary, modifiable factor, as surgeons were reluctant to clear patients for surgery without cardiac consultation. Preoperative transthoracic echocardiogram (TTE) was recommended as a safe alternative to cardiac consultation in most patients.
Methods: A retrospective review was conducted for patients who underwent urgent hip fracture repair between January 2010 and April 2014 (n = 316). Time to medical optimization, time to surgery, hospital length of stay, and anesthesia induction were compared for 3 patient groups of interest: those who received (1) neither TTE nor cardiology consultation (ie, direct to surgery); (2) a preoperative TTE; or (3) preoperative cardiac consultation.
Results: There were significant between-group differences in medical optimization time (P = 0.001) and mean time to surgery (P < 0.001) when comparing the 3 groups of interest. Patients in the preoperative cardiac consult group had the longest times, followed by the TTE and direct-to-surgery groups. There were no differences in the type of induction agent used across treatment groups when stratifying by ejection fraction.
Conclusion: Preoperative TTE allows for decreased preoperative time compared to a cardiology consultation. It provides an easily implemented interdepartmental, intra-institutional intervention to decrease preoperative time in patients presenting with hip fractures.
Keywords: surgical delay; preoperative risk stratification; process improvement.
Hip fractures are common, expensive, and associated with poor outcomes.1,2 Ample literature suggests that morbidity, mortality, and cost of care may be reduced by minimizing surgical delays.3-5 While individual reports indicate mixed evidence, in a 2010 meta-analysis, surgery within 72 hours was associated with significant reductions in pneumonia and pressure sores, as well as a 19% reduction in all-cause mortality through 1 year.6 Additional reviews suggest evidence of improved patient outcomes (pain, length of stay, non-union, and/or mortality) when surgery occurs early, within 12 to 72 hours after injury.4,6,7 Regardless of the definition of “early surgery” used, surgical delay remains a challenge, often due to organizational factors, including admission day of the week and hospital staffing, and patient characteristics, such as comorbidities, echocardiographic findings, age, and insurance status.7-9
Among factors that contribute to surgical delays, the need for preoperative cardiovascular risk stratification is significantly modifiable.10 The American College of Cardiology (ACC)/American Heart Association (AHA) Task Force risk stratification framework for preoperative cardiac testing assists clinicians in determining surgical urgency, active cardiac conditions, cardiovascular risk factors, and functional capacity of each patient, and is well established for low- or intermediate-risk patients.11 Specifically, metabolic equivalents (METs) measurements are used to identify medically stable patients with good or excellent functional capacity versus poor or unknown functional status. Patients with ≥ 4 METs may proceed to surgery without further testing; patients with < 4 METs may either proceed with planned surgery or undergo additional testing. For patients with a perceived increased risk profile who require urgent or semi-urgent hip fracture repair, however, care may be complicated by disagreement about which preoperative cardiac testing is required.
At OhioHealth Grant Medical Center (GMC), an urban, level 1 trauma center, the consideration of further preoperative noninvasive testing frequently contributed to surgical delays. In 2009, hip fracture patients arriving to the emergency department (ED) waited an average of 51 hours before being transferred to the operating room (OR) for surgery. Presuming prompt surgery is both desirable and feasible, the Grant Hip Fracture Management Committee (GHFMC) was developed in order to expedite surgeries in hip fracture patients. The GHFMC recommended a preoperative hip fracture protocol, and the outcomes from protocol implementation are described in this article.
Methods
This study was approved by the OhioHealth Institutional Review Board, with a waiver of the informed consent requirement. Medical records from patients treated at GMC during the time period between January 2010 and April 2014 (ie, following implementation of GHFMC recommendations) were retrospectively reviewed to identify the extent to which the use of preoperative transthoracic echocardiography (TTE) reduced average time to surgery and total length of stay, compared to cardiac consultation. This chart review included 316 participants and was used to identify primary induction agent utilized, time to medical optimization, time to surgery, and total length of hospital stay.
Intervention
The GHFMC conducted a 9-month quality improvement project to decrease ED-to-OR time to less than 24 hours for hip fracture patients. The multidisciplinary committee consisted of physicians from orthopedic surgery, anesthesia, hospital medicine, and geriatrics, along with key administrators and nurse outcomes managers. While there is lack of complete clarity surrounding optimal surgical timing, the committee decided that surgery within 24 hours would be beneficial for the majority of patients and therefore was considered a prudent goal.
Based on identified barriers that contributed to surgical delays, several process improvement strategies were implemented, including admitting patients to the hospitalist service, engaging the orthopedic trauma team, and implementing pre- and postoperative protocols and order sets (eg, ED and pain management order sets). Specific emphasis was placed on establishing guidelines for determining medical optimization. In the absence of established guidelines, medical optimization was determined at the discretion of the attending physician. The necessity of preoperative cardiac assessment was based, in part, on physician concerns about determining safe anesthesia protocols and hemodynamically managing patients who may have occult heart disease, specifically those patients with low functional capacity (< 4 METs) and/or inability to accurately communicate their medical history.
Many hip fractures result from a fall, and it may be unclear whether the fall causing a fracture was purely mechanical or indicative of a distinct acute or chronic illness. As a result, many patients received cardiac consultations, with or without pharmacologic stress testing, adding another 24 to 36 hours to preoperative time. As invasive preoperative cardiac procedures generally result in surgical delays without improving outcomes,11 the committee recommended that clinicians reserve preoperative cardiac consultation for patients with active cardiac conditions.
In lieu of cardiac consultation, the committee suggested preoperative TTE. While use of TTE has not been shown to improve preoperative risk stratification in routine noncardiac surgeries, it has been shown to provide clinically useful information in patients at high risk for cardiac complications.11 There was consensus for incorporating preoperative TTE for several reasons: (1) the patients with hip fractures were not “routine,” and often did not have a reliable medical history; (2) a large percentage of patients had cardiac risk factors; (3) patients with undiagnosed aortic stenosis, severe left ventricular dysfunction, or severe pulmonary hypertension would likely have altered intraoperative fluid management; and (4) in supplanting cardiac consultations, TTE would likely expedite patients’ ED-to-OR times. Therefore, the GHFMC created a recommendation of ordering urgent TTE for patients who were unable to exercise at ≥ 4 METs but needed urgent hip fracture surgery.
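The routing logic described above can be expressed as a short sketch. This is an illustration only, not the committee's actual decision tool; the function name and arguments are hypothetical.

```python
def preoperative_pathway(active_cardiac_condition, mets):
    """Sketch of the GHFMC routing described in the text.

    mets is None when functional capacity is unknown (e.g., the patient
    cannot reliably communicate a medical history).
    """
    if active_cardiac_condition:
        # Cardiac consultation reserved for active cardiac conditions
        return "cardiac consult"
    if mets is None or mets < 4:
        # Poor or unknown functional capacity -> urgent preoperative TTE
        return "urgent TTE"
    # Medically stable with >= 4 METs -> proceed directly to surgery
    return "direct to surgery"
```

For example, a stable patient exercising at 5 METs would proceed directly to surgery, while a patient with unknown functional capacity would receive an urgent TTE.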
In order to evaluate the success of the new protocol, the ED-to-OR times were calculated for a cohort of patients who underwent surgery for hip fracture following algorithm implementation.
Participants
A chart review was conducted for patients admitted to GMC between January 2010 and April 2014 for operative treatment of a hip fracture. Exclusion criteria included lack of radiologist-diagnosed hip fracture, periprosthetic hip fracture, or multiple traumas. Electronic patient charts were reviewed by investigators (KI and BO) using a standardized, electronic abstraction form for 3 groups of patients who (1) proceeded directly to planned surgery without TTE or cardiac consultation (direct-to-surgery group); (2) received preoperative TTE but not a cardiac consultation (TTE-only group); or (3) received preoperative cardiac consultation (cardiac consult group).
Measures
Demographics, comorbid conditions, MET score, anesthesia protocol, and in-hospital morbidity and mortality were extracted from medical charts. Medical optimization time was determined by the latest time stamp of 1 of the following: time that the final consulting specialist stated that the patient was stable for surgery; time that the hospitalist described the patient as being ready for surgery; time that the TTE report was certified by the reading cardiologist; or time that the hospitalist described the outcome of completed preoperative risk stratification. Time elapsed prior to medical optimization, surgery, and discharge were calculated using differences between the patient’s arrival date and time at the ED, first recorded time of medical optimization, surgical start time (from the surgical report), and discharge time, respectively.
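The "latest available time stamp" rule and the elapsed-time calculations can be sketched as follows; the helper names and example time stamps are hypothetical, not taken from the study data.

```python
from datetime import datetime

def optimization_time(candidate_stamps):
    """Medical optimization time = the latest of the available clearance
    time stamps (specialist clearance, hospitalist note, certified TTE
    report, or completed risk stratification). Missing stamps are None."""
    stamps = [s for s in candidate_stamps if s is not None]
    return max(stamps) if stamps else None

def hours_between(start, end):
    """Elapsed hours between two chart time stamps, e.g., ED arrival to
    surgical start or to discharge."""
    return (end - start).total_seconds() / 3600

# Hypothetical example: ED arrival one afternoon, TTE certified next morning
ed_arrival = datetime(2012, 3, 1, 14, 0)
tte_certified = datetime(2012, 3, 2, 9, 30)
opt = optimization_time([tte_certified, None])
print(hours_between(ed_arrival, opt))  # 19.5 hours to optimization
```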
To assess whether the TTE protocol may have affected anesthesia selection, the induction agent (etomidate or propofol) was abstracted from anesthesia reports and stratified by the ejection fraction of each patient: very low (≤ 35%), low (36%–50%), or normal (> 50%). Patients without an echocardiogram report were assumed to have a normal ejection fraction for this analysis.
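The ejection fraction bands, including the assumption that patients without an echocardiogram report are normal, amount to a simple categorization (function name hypothetical):

```python
def ef_category(ef_percent):
    """Stratify ejection fraction per the study's bands. Patients without
    an echocardiogram report (ef_percent is None) are assumed normal."""
    if ef_percent is None or ef_percent > 50:
        return "normal"      # > 50%, or no echo report
    if ef_percent > 35:
        return "low"         # 36%-50%
    return "very low"        # <= 35%
```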
Analysis
Descriptive statistics were produced using mean and standard deviation (SD) for continuous variables and frequency and percentage for categorical variables. To determine whether statistically significant differences existed between the 3 groups, the Kruskal-Wallis test was used to compare skewed continuous variables, and Pearson’s chi-square test was used to compare categorical variables. Due to differences in baseline patient characteristics across the 3 treatment groups, inverse probability weights were used to adjust for group differences (using a multinomial logit treatment model) while comparing differences in outcome variables. This modeling strategy does not rely on any assumptions for the distribution of the outcome variable. Covariates were considered for inclusion in the treatment or outcome model if they were significantly associated (P < 0.05) with the group variable. Additionally, anesthetic agent (etomidate or propofol) was compared across the treatment groups after stratifying by ejection fraction to identify whether any differences existed in anesthesia regimen. Patients who were prescribed more than 1 anesthetic agent (n = 2) or an agent that was not of interest were removed from the analysis (n = 13). Stata (version 14) was used for analysis. All other missing data with respect to the tested variables were omitted in the analysis for that variable. Any disagreements about abstraction were resolved through consensus between the investigators.
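The weighting strategy can be sketched in Python (the study itself used Stata 14): a multinomial logit treatment model gives each patient's probability of the group they actually received, and the inverse of that probability weights the outcome comparison. All data below are simulated and the variable names are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 300
age = rng.normal(76, 12, n)            # simulated covariates
n_comorbid = rng.poisson(1.5, n)
X = np.column_stack([age, n_comorbid])

# Simulated 3-level treatment: 0 = direct to surgery, 1 = TTE only,
# 2 = cardiac consult; simulated ED-to-OR outcome in hours
group = rng.choice(3, size=n, p=[0.35, 0.45, 0.20])
ed_to_or_hours = 20 + 5 * group + 0.1 * (age - 76) + rng.normal(0, 5, n)

# Multinomial logit treatment model -> P(received group | covariates)
treat_model = LogisticRegression(max_iter=1000).fit(X, group)
p_received = treat_model.predict_proba(X)[np.arange(n), group]
w = 1.0 / p_received                   # inverse probability weights

# Weighted group means of the outcome, making no distributional
# assumption about the outcome itself
for g in range(3):
    mask = group == g
    print(g, round(np.average(ed_to_or_hours[mask], weights=w[mask]), 2))
```

Patients who are over-represented in their received group (high predicted probability) are down-weighted, which balances the measured covariates across the three groups before the outcomes are compared.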
Results
A total of 316 cases met inclusion criteria, including 108 direct-to-surgery patients, 143 preoperative TTE patients, and 65 cardiac consult patients. Patient demographics and preoperative characteristics are shown in Table 1. The average age of all patients was 76.5 years (SD, 12.89; interquartile range [IQR], 34-97 years); however, direct-to-surgery patients were significantly (P < 0.001) younger (71.2 years; SD, 14.2; IQR, 34-95 years) than TTE-only patients (79.0 years; SD, 11.5; IQR, 35-97 years) and cardiac consult patients (79.57 years; SD, 10.63; IQR, 49-97 years). The majority of patients were female (69.9%) and experienced a fall prior to admission (94%). Almost three-fourths of patients had 1 or more cardiac risk factors (73.7%), including history of congestive heart failure (CHF; 19%), coronary artery disease (CAD; 26.3%), chronic obstructive pulmonary disease (COPD; 19.3%), or aortic stenosis (AS; 3.5%). Due to between-group differences in these comorbid conditions, confounding factors were adjusted for in subsequent analyses.
As shown in Table 2, before adjustment for confounding factors, there were significant between-group differences in medical optimization time for patients in all 3 groups. After adjustment for treatment differences using age and number of comorbid diseases, and medical optimization time differences using age and COPD, fewer between-group differences were statistically significant. Patients who received a cardiac consult had an 18.44-hour longer medical optimization time compared to patients who went directly to surgery (29.136 vs 10.696 hours; P = 0.001). Optimization remained approximately 5 hours longer for the TTE-only group than for the direct-to-surgery group; however, this difference was not significant (P = 0.075).
When comparing differences in ED-to-OR time for the 3 groups after adjusting the probability of treatment for age and the number of comorbid conditions, and adjusting the probability of ED-to-OR time for age, COPD, and CHF, significant differences remained in ED-to-OR times across all groups. Specifically, patients in the direct-to-surgery group experienced the shortest time (mean, 20.64 hours), compared to patients in the TTE-only group (mean, 26.32 hours; P = 0.04) and patients in the cardiac consult group (mean, 36.08 hours; P < 0.001). ED-to-OR time was thus 5.68 hours longer for TTE-only patients and, on average, 15.44 hours longer for cardiac consult patients than for the direct-to-surgery group.
Differences in length of stay across the 3 groups were observed before statistical adjustment; however, after adjusting for confounding factors related to treatment (age and CAD) and the outcome (age and the number of comorbid conditions), there were no statistically significant differences in length of stay among the 3 groups. Average length of stay was 131 hours for direct-to-surgery patients, 142 hours for TTE-only patients, and 141 hours for cardiac consult patients.
The use of different anesthetic agents was compared for patients in the 3 groups. The majority of patients in the study (87.7%) were given propofol, and there were no differences after stratifying by ejection fraction (Table 3).
Discussion
The GHFMC was created to reduce surgical delays for hip fracture. Medical optimization was considered a primary, modifiable factor given that surgeons were reluctant to proceed without a cardiac consult. To address this gap, the committee recommended a preoperative TTE for patients with low or unknown functional status. This threshold provides a quick and easy method for stratifying patients who previously required risk stratification by a cardiologist, which often resulted in surgery delays.
In their recommendations for implementation of hip fracture quality improvement projects, the Geriatric Fracture Center emphasizes the importance of multidisciplinary physician leadership along with standardization of approach across patients.12 This recommendation is supported by increasing evidence that orthogeriatric collaborations are associated with decreased mortality and length of stay.13 The GHFMC and subsequent interventions reflect this approach, allowing for collaboration to identify cross-disciplinary procedural barriers to care. In our institution, addressing identified procedural barriers to care was associated with a reduction in the average time to surgery from 51 hours to 25.3 hours.
Multiple approaches have been attempted to decrease presurgical time in hip fracture patients in various settings. Prehospital interventions, such as providing ambulances with checklists and ability to bypass the ED, have not been shown to decrease time to surgery for hip fracture patients, though similar strategies have been successful in other conditions, such as stroke.14,15 In-hospital procedures, such as implementation of a hip fracture protocol and reduction of preoperative interventions, have more consistently been found to decrease time to surgery and in-hospital mortality.16,17 However, reduced delays have not been found universally. Luttrell and Nana found that preoperative TTE was associated with an approximately 30.8-hour longer ED-to-OR time compared with patients who did not receive a preoperative TTE.18 However, in that study hospitalists used TTE at their own discretion, and there may have been confounding factors contributing to delays. When used as part of a protocol targeting patients with poor or unknown functional capacity, we believe that preoperative TTE results in modest surgical delays yet provides clinically useful information about each patient.
ACC/AHA preoperative guidelines were updated after we implemented our intervention and now recommend that patients with poor or unknown functional capacity in whom stress testing will not influence care proceed to surgery “according to guideline-directed medical care.”11 While routine use of preoperative evaluation of left ventricular function is not recommended, assessing left ventricular function may be reasonable for patients with heart failure with a change in clinical status. Guidelines also recommend that patients with clinically suspected valvular stenosis undergo preoperative echocardiography.11
Limitations
This study has several limitations. First, due to resource limitations, a substantial period of time elapsed between implementation of the new protocol and the analysis of the data set. That is, the hip fracture protocol assessed in this paper occurred from January 2010 through April 2014, and final analysis of the data set occurred in April 2020. This limitation precludes our ability to formally assess any pre- or post-protocol changes in patient outcomes. Second, randomization was not used to create groups that were balanced in differing health characteristics (eg, patients with noncardiac-related surgeries, patients in different age groups); however, the use of inverse probability treatment regression analysis was a way to statistically address these between-group differences. Moreover, this study is limited by the factors that were measured; unmeasured factors cannot be accounted for. Third, health care providers working at the hospital during this time were aware of the goal to decrease presurgical time, possibly creating exaggerated effects compared to a blinded trial. Finally, although this intervention is likely translatable to other centers, these results represent the experiences of a single level 1 trauma center and may not be replicable elsewhere.
Conclusion
Preoperative TTE in lieu of cardiac consultation has several advantages. First, it requires interdepartmental collaboration for implementation, but can be implemented through a single hospital or hospital system. Unlike prehospital interventions, preoperative urgent TTE for patients with low functional capacity does not require the support of emergency medical technicians, ambulance services, or other hospitals in the region. Second, while costs are associated with TTE, they are offset by a reduction in expensive consultations with specialists, surgical delays, and longer lengths of stay. Third, despite likely increased ED-to-OR times compared to no intervention, urgent TTE decreases time to surgery compared with cardiology consultation. Prior to the GHFMC, the ED-to-OR time at our institution was 51 hours. In contrast, the mean time following the GHFMC-led protocol was less than half that, at 25.3 hours (SD, 19.1 hours). In fact, nearly two-thirds (65.2%) of the patients evaluated in this study underwent surgery within 24 hours of admission. This improvement in presurgical time was attributed, in part, to the implementation of preoperative TTE over cardiology consultations.
Acknowledgments: The authors thank Jenny Williams, RN, who was instrumental in obtaining the data set for analysis, and Shauna Ayres, MPH, from the OhioHealth Research Institute, who provided writing and technical assistance.
Corresponding author: Robert Skully, MD, OhioHealth Family Medicine Grant, 290 East Town St., Columbus, OH 43215; [email protected].
Funding: This work was supported by the OhioHealth Summer Research Externship Program.
Financial disclosures: None.
From Dignity Health Methodist Hospital of Sacramento Family Medicine Residency Program, Sacramento, CA (Dr. Oldach); Nationwide Children’s Hospital, Columbus, OH (Dr. Irwin); OhioHealth Research Institute, Columbus, OH (Dr. Pershing); Department of Clinical Transformation, OhioHealth, Columbus, OH (Dr. Zigmont and Dr. Gascon); and Department of Geriatrics, OhioHealth, Columbus, OH (Dr. Skully).
Abstract
Objective: An interdisciplinary committee was formed to identify factors contributing to surgical delays in urgent hip fracture repair at an urban, level 1 trauma center, with the goal of reducing preoperative time to less than 24 hours. Surgical optimization was identified as a primary, modifiable factor, as surgeons were reluctant to clear patients for surgery without cardiac consultation. Preoperative transthoracic echocardiogram (TTE) was recommended as a safe alternative to cardiac consultation in most patients.
Methods: A retrospective review was conducted for patients who underwent urgent hip fracture repair between January 2010 and April 2014 (n = 316). Time to medical optimization, time to surgery, hospital length of stay, and anesthesia induction were compared for 3 patient groups of interest: those who received (1) neither TTE nor cardiology consultation (ie, direct to surgery); (2) a preoperative TTE; or (3) preoperative cardiac consultation.
Results: There were significant between-group differences in medical optimization time (P = 0.001) and mean time to surgery (P < 0.001) when comparing the 3 groups of interest. Patients in the preoperative cardiac consult group had the longest times, followed by the TTE and direct-to-surgery groups. There were no differences in the type of induction agent used across treatment groups when stratifying by ejection fraction.
Conclusion: Preoperative TTE allows for decreased preoperative time compared to a cardiology consultation. It provides an easily implemented interdepartmental, intra-institutional intervention to decrease preoperative time in patients presenting with hip fractures.
Keywords: surgical delay; preoperative risk stratification; process improvement.
Hip fractures are common, expensive, and associated with poor outcomes.1,2 Ample literature suggests that morbidity, mortality, and cost of care may be reduced by minimizing surgical delays.3-5 While individual reports indicate mixed evidence, in a 2010 meta-analysis, surgery within 72 hours was associated with significant reductions in pneumonia and pressure sores, as well as a 19% reduction in all-cause mortality through 1 year.6 Additional reviews suggest evidence of improved patient outcomes (pain, length of stay, non-union, and/or mortality) when surgery occurs early, within 12 to 72 hours after injury.4,6,7 Regardless of the definition of “early surgery” used, surgical delay remains a challenge, often due to organizational factors, including admission day of the week and hospital staffing, and patient characteristics, such as comorbidities, echocardiographic findings, age, and insurance status.7-9
Among factors that contribute to surgical delays, the need for preoperative cardiovascular risk stratification is significantly modifiable.10 The American College of Cardiology (ACC)/American Heart Association (AHA) Task Force risk stratification framework for preoperative cardiac testing assists clinicians in determining surgical urgency, active cardiac conditions, cardiovascular risk factors, and functional capacity of each patient, and is well established for low- or intermediate-risk patients.11 Specifically, metabolic equivalent (MET) measurements are used to distinguish medically stable patients with good or excellent functional capacity from those with poor or unknown functional status. Patients with ≥ 4 METs may proceed to surgery without further testing; patients with < 4 METs may either proceed with planned surgery or undergo additional testing. For patients with a perceived increased risk profile who require urgent or semi-urgent hip fracture repair, however, care may be complicated by disagreement about which preoperative cardiac testing is required.
At OhioHealth Grant Medical Center (GMC), an urban, level 1 trauma center, the consideration of further preoperative noninvasive testing frequently contributed to surgical delays. In 2009, hip fracture patients arriving to the emergency department (ED) waited an average of 51 hours before being transferred to the operating room (OR) for surgery. Presuming prompt surgery is both desirable and feasible, the Grant Hip Fracture Management Committee (GHFMC) was developed in order to expedite surgeries in hip fracture patients. The GHFMC recommended a preoperative hip fracture protocol, and the outcomes from protocol implementation are described in this article.
Methods
This study was approved by the OhioHealth Institutional Review Board, with a waiver of the informed consent requirement. Medical records from patients treated at GMC between January 2010 and April 2014 (ie, following implementation of GHFMC recommendations) were retrospectively reviewed to identify the extent to which the use of preoperative transthoracic echocardiography (TTE) reduced average time to surgery and total length of stay, compared to cardiac consultation. This chart review included 316 participants and was used to identify the primary induction agent used, time to medical optimization, time to surgery, and total length of hospital stay.
Intervention
The GHFMC conducted a 9-month quality improvement project to decrease ED-to-OR time to less than 24 hours for hip fracture patients. The multidisciplinary committee consisted of physicians from orthopedic surgery, anesthesia, hospital medicine, and geriatrics, along with key administrators and nurse outcomes managers. While optimal surgical timing remains somewhat unclear, the committee decided that surgery within 24 hours would benefit the majority of patients and was therefore a prudent goal.
Based on identified barriers that contributed to surgical delays, several process improvement strategies were implemented, including admitting patients to the hospitalist service, engaging the orthopedic trauma team, and implementing pre- and postoperative protocols and order sets (eg, ED and pain management order sets). Specific emphasis was placed on establishing guidelines for determining medical optimization. In the absence of established guidelines, medical optimization was determined at the discretion of the attending physician. The necessity of preoperative cardiac assessment was based, in part, on physician concerns about determining safe anesthesia protocols and hemodynamically managing patients who may have occult heart disease, specifically those patients with low functional capacity (< 4 METs) and/or inability to accurately communicate their medical history.
Many hip fractures result from a fall, and it may be unclear whether the fall causing a fracture was purely mechanical or indicative of a distinct acute or chronic illness. As a result, many patients received cardiac consultations, with or without pharmacologic stress testing, adding another 24 to 36 hours to preoperative time. As invasive preoperative cardiac procedures generally result in surgical delays without improving outcomes,11 the committee recommended that clinicians reserve preoperative cardiac consultation for patients with active cardiac conditions.
In lieu of cardiac consultation, the committee suggested preoperative TTE. While use of TTE has not been shown to improve preoperative risk stratification in routine noncardiac surgeries, it has been shown to provide clinically useful information in patients at high risk for cardiac complications.11 There was consensus for incorporating preoperative TTE for several reasons: (1) the patients with hip fractures were not “routine,” and often did not have a reliable medical history; (2) a large percentage of patients had cardiac risk factors; (3) patients with undiagnosed aortic stenosis, severe left ventricular dysfunction, or severe pulmonary hypertension would likely have altered intraoperative fluid management; and (4) in supplanting cardiac consultations, TTE would likely expedite patients’ ED-to-OR times. Therefore, the GHFMC recommended ordering urgent TTE for patients who were unable to exercise at ≥ 4 METs but needed urgent hip fracture surgery.
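The triage logic described above can be expressed as a small decision function. This is our own hypothetical encoding, not part of the GHFMC protocol; the return labels are illustrative only.

```python
def preop_pathway(mets, active_cardiac_condition):
    """Hypothetical encoding of the GHFMC preoperative triage:
    cardiac consultation is reserved for active cardiac conditions,
    patients with adequate functional capacity (>= 4 METs) proceed
    directly to surgery, and everyone else receives an urgent TTE."""
    if active_cardiac_condition:
        return "cardiac consultation"
    if mets is not None and mets >= 4:
        return "direct to surgery"
    # Poor (< 4 METs) or unknown functional capacity
    return "urgent TTE"
```

For example, a patient with an unknown MET level and no active cardiac condition would be routed to urgent TTE rather than to a cardiology consult.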
In order to evaluate the success of the new protocol, the ED-to-OR times were calculated for a cohort of patients who underwent surgery for hip fracture following algorithm implementation.
Participants
A chart review was conducted for patients admitted to GMC between January 2010 and April 2014 for operative treatment of a hip fracture. Exclusion criteria included lack of radiologist-diagnosed hip fracture, periprosthetic hip fracture, or multiple traumas. Electronic patient charts were reviewed by investigators (KI and BO) using a standardized, electronic abstraction form for 3 groups of patients who (1) proceeded directly to planned surgery without TTE or cardiac consultation (direct-to-surgery group); (2) received preoperative TTE but not a cardiac consultation (TTE-only group); or (3) received preoperative cardiac consultation (cardiac consult group).
Measures
Demographics, comorbid conditions, MET score, anesthesia protocol, and in-hospital morbidity and mortality were extracted from medical charts. Medical optimization time was determined by the latest time stamp of 1 of the following: time that the final consulting specialist stated that the patient was stable for surgery; time that the hospitalist described the patient as being ready for surgery; time that the TTE report was certified by the reading cardiologist; or time that the hospitalist described the outcome of completed preoperative risk stratification. Elapsed times to medical optimization, surgery, and discharge were calculated as the differences between the patient’s arrival date and time at the ED and the first recorded time of medical optimization, the surgical start time (from the surgical report), and the discharge time, respectively.
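As a concrete illustration of the timing definitions above, a minimal sketch (function and field names are ours, not from the study database):

```python
from datetime import datetime

def hours_since_arrival(ed_arrival, event_time):
    """Elapsed hours from ED arrival to an event (surgery start, discharge)."""
    return (event_time - ed_arrival).total_seconds() / 3600.0

def optimization_hours(ed_arrival, candidate_stamps):
    """Medical optimization time: the latest available 'ready for surgery'
    time stamp, expressed as hours since ED arrival. Entries may be None
    for events that never occurred (e.g. no TTE report for a
    direct-to-surgery patient)."""
    stamps = [t for t in candidate_stamps if t is not None]
    if not stamps:
        return None
    return hours_since_arrival(ed_arrival, max(stamps))
```

For instance, an 8:00 AM ED arrival with a TTE report certified at 8:00 PM the same day yields an optimization time of 12 hours.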
To assess whether the TTE protocol may have affected anesthesia selection, the induction agent (etomidate or propofol) was abstracted from anesthesia reports and stratified by the ejection fraction of each patient: very low (≤ 35%), low (36%–50%), or normal (> 50%). Patients without an echocardiogram report were assumed to have a normal ejection fraction for this analysis.
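The ejection fraction strata above, including the assumption that patients without an echocardiogram report are treated as normal, could be encoded as follows (a sketch; the category labels are ours):

```python
def ef_category(ef):
    """Ejection fraction strata used in the anesthesia analysis.
    Per the study's assumption, a missing echocardiogram report
    (ef is None) is treated as a normal ejection fraction."""
    if ef is None or ef > 50:
        return "normal"
    if ef <= 35:
        return "very low"
    return "low"  # 36%-50%
```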
Analysis
Descriptive statistics were produced using mean and standard deviation (SD) for continuous variables and frequency and percentage for categorical variables. To determine whether statistically significant differences existed between the 3 groups, the Kruskal-Wallis test was used to compare skewed continuous variables, and Pearson’s chi-square test was used to compare categorical variables. Due to differences in baseline patient characteristics across the 3 treatment groups, inverse probability weights were used to adjust for group differences (using a multinomial logit treatment model) while comparing differences in outcome variables. This modeling strategy does not rely on any assumptions about the distribution of the outcome variable. Covariates were considered for inclusion in the treatment or outcome model if they were significantly associated (P < 0.05) with the group variable. Additionally, anesthetic agent (etomidate or propofol) was compared across the treatment groups after stratifying by ejection fraction to identify whether any differences existed in anesthesia regimen. Patients who were prescribed more than 1 anesthetic agent (n = 2) or an agent that was not of interest (n = 13) were removed from the analysis. Stata (version 14) was used for analysis. For all other variables, records with missing data were omitted from the analysis of that variable. Any disagreements about abstraction were resolved through consensus between the investigators.
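The study used Stata’s inverse-probability-weighting machinery; the core idea can be sketched in Python as follows. This is a simplified illustration, not the study’s actual code, and the column names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def ipw_group_means(df, group_col, outcome_col, covariates):
    """Inverse-probability-weighted outcome means per treatment group.

    Fits a multinomial logit of group membership on the covariates,
    weights each patient by 1 / P(observed group | covariates), and
    returns the weighted mean outcome for each group, mimicking the
    adjustment for baseline differences described in the text."""
    X = df[covariates].to_numpy()
    model = LogisticRegression(max_iter=1000).fit(X, df[group_col])
    probs = model.predict_proba(X)  # n x k matrix of group probabilities
    cols = {g: j for j, g in enumerate(model.classes_)}
    idx = df[group_col].map(cols).to_numpy()
    w = 1.0 / probs[np.arange(len(df)), idx]  # inverse probability weights
    means = {}
    for name, grp in df.assign(_w=w).groupby(group_col):
        means[name] = np.average(grp[outcome_col], weights=grp["_w"])
    return means
```

In a real analysis the weighted means would be accompanied by appropriately adjusted standard errors, which Stata’s treatment-effects commands provide.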
Results
A total of 316 cases met inclusion criteria, including 108 direct-to-surgery patients, 143 preoperative TTE patients, and 65 cardiac consult patients. Patient demographics and preoperative characteristics are shown in Table 1. The average age for all patients was 76.5 years (SD, 12.89; interquartile range [IQR], 34-97 years); however, direct-to-surgery patients were significantly (P < 0.001) younger (71.2 years; SD, 14.2; IQR, 34-95 years) than TTE-only patients (79.0 years; SD, 11.5; IQR, 35-97 years) and cardiac consult patients (79.57 years; SD, 10.63; IQR, 49-97 years). The majority of patients were female (69.9%) and experienced a fall prior to admission (94%). Almost three-fourths of patients had 1 or more cardiac risk factors (73.7%), including history of congestive heart failure (CHF; 19%), coronary artery disease (CAD; 26.3%), chronic obstructive pulmonary disease (COPD; 19.3%), or aortic stenosis (AS; 3.5%). Due to between-group differences in these comorbid conditions, confounding factors were adjusted for in subsequent analyses.
As shown in Table 2, before adjustment for confounding factors, there were significant between-group differences in medical optimization time for patients in all 3 groups. After adjustment for treatment differences using age and number of comorbid diseases, and medical optimization time differences using age and COPD, fewer between-group differences were statistically significant. Patients who received a cardiac consult had an 18.44-hour longer medical optimization time compared to patients who went directly to surgery (29.136 vs 10.696 hours; P = 0.001). Optimization remained approximately 5 hours longer for the TTE-only group than for the direct-to-surgery group; however, this difference was not significant (P = 0.075).
When comparing differences in ED-to-OR time for the 3 groups after adjusting the probability of treatment for age and the number of comorbid conditions, and adjusting ED-to-OR time for age, COPD, and CHF, significant differences remained in ED-to-OR times across all groups. Specifically, patients in the direct-to-surgery group experienced the shortest time (mean, 20.64 hours), compared to patients in the TTE-only group (mean, 26.32 hours; P = 0.04) and patients in the cardiac consult group (mean, 36.08 hours; P < 0.001). ED-to-OR times were 5.68 hours longer for TTE-only patients than for the direct-to-surgery group, and on average 15.44 hours longer for cardiac consult patients than for the direct-to-surgery group.
When comparing differences in the length of stay for the 3 groups before statistical adjustments, differences were observed; however, after removing the confounding factors related to treatment (age and CAD) and the outcome (age and the number of comorbid conditions), there were no statistically significant differences in the length of stay for the 3 groups. Average length of stay was 131 hours for direct-to-surgery patients, 142 hours for TTE-only patients, and 141 hours for cardiac consult patients.
The use of different anesthetic agents was compared for patients in the 3 groups. The majority of patients in the study (87.7%) were given propofol, and there were no differences after stratifying by ejection fraction (Table 3).
Discussion
The GHFMC was created to reduce surgical delays for hip fracture. Medical optimization was considered a primary, modifiable factor given that surgeons were reluctant to proceed without a cardiac consult. To address this gap, the committee recommended a preoperative TTE for patients with low or unknown functional status. This threshold provides a quick and easy method for stratifying patients who previously required risk stratification by a cardiologist, which often resulted in surgery delays.
In their recommendations for implementation of hip fracture quality improvement projects, the Geriatric Fracture Center emphasizes the importance of multidisciplinary physician leadership along with standardization of approach across patients.12 This recommendation is supported by increasing evidence that orthogeriatric collaborations are associated with decreased mortality and length of stay.13 The GHFMC and subsequent interventions reflect this approach, allowing for collaboration to identify cross-disciplinary procedural barriers to care. In our institution, addressing identified procedural barriers to care was associated with a reduction in the average time to surgery from 51 hours to 25.3 hours.
Multiple approaches have been attempted to decrease presurgical time in hip fracture patients in various settings. Prehospital interventions, such as providing ambulances with checklists and ability to bypass the ED, have not been shown to decrease time to surgery for hip fracture patients, though similar strategies have been successful in other conditions, such as stroke.14,15 In-hospital procedures, such as implementation of a hip fracture protocol and reduction of preoperative interventions, have more consistently been found to decrease time to surgery and in-hospital mortality.16,17 However, reduced delays have not been found universally. Luttrell and Nana found that preoperative TTE resulted in approximately 30.8-hour delays from the ED to OR, compared to patients who did not receive a preoperative TTE.18 However, in that study hospitalists used TTE at their own discretion, and there may have been confounding factors contributing to delays. When used as part of a protocol targeting patients with poor or unknown functional capacity, we believe that preoperative TTE results in modest surgical delays yet provides clinically useful information about each patient.
ACC/AHA preoperative guidelines were updated after we implemented our intervention and now recommend that patients with poor or unknown functional capacity in whom stress testing will not influence care proceed to surgery “according to guideline-directed medical care.”11 While routine use of preoperative evaluation of left ventricular function is not recommended, assessing left ventricular function may be reasonable for patients with heart failure with a change in clinical status. Guidelines also recommend that patients with clinically suspected valvular stenosis undergo preoperative echocardiography.11
Limitations
This study has several limitations. First, due to resource limitations, a substantial period of time elapsed between implementation of the new protocol and the analysis of the data set. That is, the hip fracture protocol assessed in this paper occurred from January 2010 through April 2014, and final analysis of the data set occurred in April 2020. This limitation precludes our ability to formally assess any pre- or post-protocol changes in patient outcomes. Second, randomization was not used to create groups that were balanced in differing health characteristics (ie, patients with noncardiac-related surgeries, patients in different age groups); however, the use of inverse probability treatment regression analysis was a way to statistically address these between-group differences. Moreover, this study is limited by the factors that were measured; unmeasured factors cannot be accounted for. Third, health care providers working at the hospital during this time were aware of the goal to decrease presurgical time, possibly creating exaggerated effects compared to a blinded trial. Finally, although this intervention is likely translatable to other centers, these results represent the experiences of a single level 1 trauma center and may not be replicable elsewhere.
Conclusion
Preoperative TTE in lieu of cardiac consultation has several advantages. First, although it requires interdepartmental collaboration, it can be implemented within a single hospital or hospital system. Unlike prehospital interventions, preoperative urgent TTE for patients with low functional capacity does not require the support of emergency medical technicians, ambulance services, or other hospitals in the region. Second, while costs are associated with TTE, they are offset by a reduction in expensive consultations with specialists, surgical delays, and longer lengths of stay. Third, despite likely increasing ED-to-OR times compared to no intervention, urgent TTE decreases time to surgery compared with cardiology consultation. Prior to the GHFMC, the mean ED-to-OR time at our institution was 51 hours. In contrast, the mean time following the GHFMC-led protocol was less than half that, at 25.3 hours (SD, 19.1 hours). In fact, nearly two-thirds (65.2%) of the patients evaluated in this study underwent surgery within 24 hours of admission. This improvement in presurgical time was attributed, in part, to the implementation of preoperative TTE over cardiology consultations.
Acknowledgments: The authors thank Jenny Williams, RN, who was instrumental in obtaining the data set for analysis, and Shauna Ayres, MPH, from the OhioHealth Research Institute, who provided writing and technical assistance.
Corresponding author: Robert Skully, MD, OhioHealth Family Medicine Grant, 290 East Town St., Columbus, OH 43215; [email protected].
Funding: This work was supported by the OhioHealth Summer Research Externship Program.
Financial disclosures: None.
1. Brauer CA, Coca-Perraillon M, Cutler DM, Rosen AB. Incidence and mortality of hip fractures in the United States. JAMA. 2009;302:1573-1579.
2. Lewiecki EM, Wright NC, Curtis JR, et al. Hip fracture trends in the United States 2002 to 2015. Osteoporos Int. 2018;29:717-722.
3. Colais P, Di Martino M, Fusco D, et al. The effect of early surgery after hip fracture on 1-year mortality. BMC Geriatr. 2015;15:141.
4. Nyholm AM, Gromov K, Palm H, et al. Time to surgery is associated with thirty-day and ninety-day mortality after proximal femoral fracture: a retrospective observational study on prospectively collected data from the Danish Fracture Database Collaborators. J Bone Joint Surg Am. 2015;97:1333-1339.
5. Judd KT, Christianson E. Expedited operative care of hip fractures results in significantly lower cost of treatment. Iowa Orthop J. 2015;35:62-64.
6. Simunovic N, Devereaux PJ, Sprague S, et al. Effect of early surgery after hip fracture on mortality and complications: systematic review and meta-analysis. CMAJ. 2010;182:1609-1616.
7. Ryan DJ, Yoshihara H, Yoneoka D, et al. Delay in hip fracture surgery: an analysis of patient-specific and hospital-specific risk factors. J Orthop Trauma. 2015;29:343-348.
8. Ricci WM, Brandt A, McAndrew C, Gardner MJ. Factors affecting delay to surgery and length of stay for patients with hip fracture. J Orthop Trauma. 2015;29:e109-e114.
9. Hagino T, Ochiai S, Senga S, et al. Efficacy of early surgery and causes of surgical delay in patients with hip fracture. J Orthop. 2015;12:142-146.
10. Rafiq A, Sklyar E, Bella JN. Cardiac evaluation and monitoring of patients undergoing noncardiac surgery. Health Serv Insights. 2017;9:1178632916686074.
11. Fleisher LA, Fleischmann KE, Auerbach AD, et al. 2014 ACC/AHA guideline on perioperative cardiovascular evaluation and management of patients undergoing noncardiac surgery: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines. J Am Coll Cardiol. 2014;64:e77-e137.
12. Basu N, Natour M, Mounasamy V, Kates SL. Geriatric hip fracture management: keys to providing a successful program. Eur J Trauma Emerg Surg. 2016;42:565-569.
13. Grigoryan KV, Javedan H, Rudolph JL. Orthogeriatric care models and outcomes in hip fracture patients: a systematic review and meta-analysis. J Orthop Trauma. 2014;28:e49-e55.
14. Tai YJ, Yan B. Minimising time to treatment: targeted strategies to minimise time to thrombolysis for acute ischaemic stroke. Intern Med J. 2013;43:1176-1182.
15. Larsson G, Stromberg RU, Rogmark C, Nilsdotter A. Prehospital fast track care for patients with hip fracture: impact on time to surgery, hospital stay, post-operative complications and mortality: a randomised, controlled trial. Injury. 2016;47:881-886.
16. Bohm E, Loucks L, Wittmeier K, et al. Reduced time to surgery improves mortality and length of stay following hip fracture: results from an intervention study in a Canadian health authority. Can J Surg. 2015;58:257-263.
17. Ventura C, Trombetti S, Pioli G, et al. Impact of multidisciplinary hip fracture program on timing of surgery in elderly patients. Osteoporos Int. 2014;25:2591-2597.
18. Luttrell K, Nana A. Effect of preoperative transthoracic echocardiogram on mortality and surgical timing in elderly adults with hip fracture. J Am Geriatr Soc. 2015;63:2505-2509.
A Multi-Membership Approach for Attributing Patient-Level Outcomes to Providers in an Inpatient Setting
From Banner Health Corporation, Phoenix, AZ.
Background: Health care providers are routinely incentivized with pay-for-performance (P4P) metrics to increase the quality of care. In an inpatient setting, P4P models typically measure quality by attributing each patient’s outcome to a single provider even though many providers routinely care for the patient. This study investigates a new attribution approach aiming to distribute each outcome across all providers who provided care.
Methods: The methodology relies on a multi-membership model and is demonstrated in the Banner Health system using 3 clinical outcome measures (length of stay, 30-day readmissions, and mortality) and responses to 3 survey questions that measure a patient’s perception of their care. The new approach is compared to the “standard” method, which attributes each patient to only 1 provider.
Results: When ranking by clinical outcomes, both methods were concordant 72.1% to 82.1% of the time for top-half/bottom-half rankings, with a median percentile difference between 7 and 15. When ranking by survey scores, there was more agreement, with concordance between 84.1% and 86.6% and a median percentile difference between 11 and 13. Last, Pearson correlation coefficients of the paired percentiles ranged from 0.56 to 0.78.
Conclusion: The new approach provides a fairer solution when measuring provider performance.
Keywords: patient attribution; PAMM; PAPR; random effect model; pay for performance.
Providers practicing in hospitals are routinely evaluated based on their performance and, in many cases, are financially incentivized for a better-than-average performance within a pay-for-performance (P4P) model. The use of P4P models is based on the belief that they will “improve, motivate, and enhance providers to pursue aggressively and ultimately achieve the quality performance targets thus decreasing the number of medical errors with less malpractice events.”1 Although P4P models continue to be a movement in health care, they have been challenging to implement.
One concern involves the general quality of implementation, such as defining metrics and targets, setting payout amounts, managing technology and market conditions, and gauging the level of transparency to the provider.2 Another challenge, and the focus of this project, is measuring performance in a way that avoids perceptions of unfairness. This concern can be minimized if attribution is handled more fairly, by spreading it across all providers who affected the outcome, whether positively or negatively.3
To implement these models, the performance of providers needs to be measured and tracked periodically. This requires linking, or attributing, a patient’s outcome to a provider, which is almost always the attending or discharging provider (ie, a single provider).3 In this single-provider attribution approach, one provider receives all the credit (good or bad) for their respective patients’ outcomes, even though the provider may have seen the patient for only a fraction of the hospitalization. Attributing outcomes—for example, length of stay (LOS), readmission rate, mortality rate, net promoter score (NPS)—using this approach reduces the validity of metrics designed to measure provider performance, especially in a rotating provider environment where many providers interact with and care for a patient. For example, the quality of providers’ interpersonal skills and competence has been found to be among the strongest determinants of patient satisfaction,4 yet it is not credible that satisfaction depends solely on the last provider seen during a hospitalization.
Proportionally distributing the attribution of an outcome has been used successfully in other contexts. Typically, a statistical modeling approach using a multi-membership framework is used because it can handle the sometimes-complicated relationships within the hierarchy. It also allows for auxiliary variables to be introduced, which can help explain and control for exogenous effects.5-7 For example, in the education setting, standardized testing is administered to students at defined years of schooling: at grades 4, 8, and 10, for instance. The progress of students, measured as the academic gains between test years, is proportionally attributed to all the teachers whom the student has had between the test years. These partial attributions are combined to evaluate overall teacher performance.8,9
Although the multi-membership framework has been used in other industries, it has yet to be applied in measuring provider performance. The purpose of this project is to investigate the impact of using a multi-provider approach compared to the standard single-provider approach. The findings may lead to modifications in the way a provider’s performance is measured and, thus, how providers are compensated. A similar study investigated the impact of proportionally distributing patients’ outcomes across all rotating providers using a weighting method based on billing practices to measure the partial impact of each provider.3
This study is different in 2 fundamental ways. First, attribution is weighted based on the number of clinically documented interactions (via clinical notes) between a patient and all rotating providers during the hospitalization. Second, performance is measured via multi-membership models, which can estimate the effect (both positive and negative) that a provider has on an outcome, even when caring for a patient a fraction of the time during the hospitalization.
Methods
Setting
Banner Health is a non-profit, multi-hospital health care system across 6 states in the western United States that is uniquely positioned to study provider quality attribution models. Not only does it have a large number of providers and serve a broad patient population, but it also uses an instance of Cerner (Kansas City, MO), an enterprise-level electronic health record (EHR) system that connects all its facilities and allows for advanced analytics across the system.
For this study, we included only general medicine and surgery patients admitted and discharged from the inpatient setting between January 1, 2018, and December 31, 2018, who were between 18 and 89 years old at admission, and who had a LOS between 1 and 14 days. Visit- and patient-level data were collected from Cerner, while outcome data, and corresponding expected outcome data, were obtained from Premier, Inc. (Charlotte, NC) using their CareScience methodologies.10 To measure patient experience, response data were extracted from post-discharge surveys administered by InMoment (Salt Lake City, UT).
Provider Attribution Models
Provider Attribution by Physician of Record (PAPR). In the standard approach, denoted here as the PAPR model, the entire hospitalization is attributed to 1 provider, typically the attending or discharging provider (who may be the same person). This provider is responsible for the patient’s care, and all patient outcomes are aggregated and attributed to this provider to gauge his or her performance. The PAPR model is the most popular form of attribution across many health care systems and is routinely used for P4P incentives.
In this study, the discharging provider was used when attributing hospitalizations using the PAPR model. Providers responsible for fewer than 12 discharges in the calendar year were excluded. Because of the directness of this type of attribution, the performance of 1 provider does not account for the performance of the other rotating providers during hospitalizations.
Provider Attribution by Multiple Membership (PAMM). In contrast, we introduce another attribution approach here that is designed to assign partial attribution to each provider who cares for the patient during the hospitalization. To aggregate the partial attributions, and possibly control for any exogenous or risk-based factors, a multiple-membership, or multi-member (MM), model is used. The MM model can measure the effect of a provider on an outcome even when the patient-to-provider relationship is complex, such as in a rotating provider environment.8
The purpose of this study is to compare attribution models and to determine whether there are meaningful differences between them. Therefore, for comparison purposes, the same discharging providers using the PAPR approach are eligible for the PAMM approach, so that both attribution models are using the same set of providers. All other providers are excluded because their performance would not be comparable to the PAPR approach.
While there are many ways to document provider-to-patient interactions, 2 methods are available in almost all health care systems. The first method is to link a provider’s billing charges to each patient-day combination. This approach limits the attribution to 1 provider per patient per day because multiple rotating providers cannot charge for the same patient-day combination.3 However, many providers interact with a patient on the same day, so using this approach excludes non-billed provider-to-patient interactions.
The second method, which was used in this study, relies on documented clinical notes within the EHR to determine how attribution is shared. In this approach, attribution is weighted based on the authorship of 3 types of eligible clinical notes: admitting history/physical notes (during admission), progress notes (during subsequent days), and discharge summary notes (during final discharge). This will (likely) result in many providers being linked to a patient on each day, which better reflects the clinical setting (Figure). Recently, clinical notes were used to attribute care of patients in an inpatient setting, and it was found that this approach provides a reliable way of tracking interactions and assigning ownership.11
The provider-level attribution weights are based on the share of authorships of eligible note types. Specifically, for each provider j, let aij be the total count of eligible note types for hospitalization i authored by provider j, and let ai be the overall total count of eligible note types for hospitalization i. Then the attribution weight is
wij = aij ⁄ ai (Eq. 1)
for hospitalization i and provider j. Note that ∑j wij = 1; in other words, the total attribution, summed across all providers, is constrained to be 1 for each hospitalization.
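In code, the weight calculation of Equation 1 reduces to a tally of note authorships. The following is a minimal Python sketch; the function name and data layout are illustrative, not from the study:

```python
from collections import Counter

def attribution_weights(note_authors):
    """Compute per-provider attribution weights for one hospitalization
    from the list of eligible-note authors (Equation 1).

    note_authors: one entry (a provider ID) per eligible clinical note,
    ie, admitting history/physical, progress note, or discharge summary.
    """
    counts = Counter(note_authors)   # a_ij: eligible notes authored by provider j
    total = sum(counts.values())     # a_i: all eligible notes for hospitalization i
    return {provider: n / total for provider, n in counts.items()}

# Hypothetical hospitalization with 8 eligible notes split among 3 providers
weights = attribution_weights(["A", "A", "A", "B", "C", "C", "C", "C"])
# Weights sum to 1 across providers, as required
assert abs(sum(weights.values()) - 1.0) < 1e-9
```

With the notes split 3/1/4, the weights come out to 0.375, 0.125, and 0.5, mirroring the 3-provider scenario used with the Figure later in the text.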
Patient Outcomes
Outcomes were chosen based on their routine use in health care systems as standards when evaluating provider performance. This study included 6 outcomes: inpatient LOS, inpatient mortality, 30-day inpatient readmission, and patient responses from 3 survey questions. These outcomes can be collected without any manual chart reviews, and therefore are viewed as objective outcomes of provider performance.
Each outcome was aggregated for each provider using both attribution methods independently. For the PAPR method, observed-to-expected (OE) indices for LOS, mortality, and readmissions were calculated along with average patient survey scores. For the PAMM method, provider-level random effects from the fitted models were used. In both cases, the calculated measures were used for ranking purposes when determining top (or bottom) providers for each outcome.
Individual Provider Metrics for the PAPR Method
Inpatient LOS Index. Hospital inpatient LOS was measured as the number of days between admission date and discharge date. For each hospital visit, an expected LOS was determined using Premier’s CareScience Analytics (CSA) risk-adjustment methodology.10 The CSA methodology for LOS incorporates a patient’s clinical history, demographics, and visit-related administrative information.
Let nj be the number of hospitalizations attributed to provider j. Let oij and eij be the observed and expected LOS, respectively, for hospitalization i = 1, … , nj attributed to provider j. Then the inpatient LOS index for provider j is Lj = (∑i oij) ⁄ (∑i eij).
Inpatient Mortality Index. Inpatient mortality was defined as the death of the patient during hospitalization. For each hospitalization, an expected mortality probability was determined using Premier’s CSA risk-adjustment methodology.10 The CSA methodology for mortality incorporates a patient’s demographics and comorbidities.
Just as before, let nj be the number of hospitalizations attributed to provider j. Let mij = 1 if the patient died during hospitalization i = 1, … , nj attributed to provider j; mij = 0 otherwise. Let pij(m) be the corresponding expected mortality probability. Then the inpatient mortality index for provider j is Mj = (∑i mij) ⁄ (∑i pij(m)).
30-Day Inpatient Readmission Index. A 30-day inpatient readmission was defined as the event when a patient is discharged and readmits back into the inpatient setting within 30 days. The inclusion criteria defined by the Centers for Medicare and Medicaid Services (CMS) all-cause hospital-wide readmission measure was used and, consequently, planned readmissions were excluded.12 Readmissions could occur at any Banner hospital, including the same hospital. For each hospital visit, an expected readmission probability was derived using Premier’s CSA risk-adjustment methodology.10 The CSA methodology for readmissions incorporates a patient’s clinical history, demographics, and visit-related administrative information.
Let nj be the number of hospitalizations attributed to provider j. Let rij = 1 if the patient had a readmission following hospitalization i = 1, … , nj attributed to provider j; rij = 0 otherwise. Let pij(r) be the corresponding expected readmission probability. Then the 30-day inpatient readmission index for provider j is Rj = (∑i rij) ⁄ (∑i pij(r)).
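All three PAPR indices share the same observed-over-expected form. A small Python sketch, with entirely hypothetical observed and CareScience-style expected values:

```python
def oe_index(observed, expected):
    """Observed-to-expected (OE) index for one provider: the sum of
    observed outcomes divided by the sum of expected outcomes across
    that provider's attributed hospitalizations."""
    return sum(observed) / sum(expected)

# Hypothetical provider with 4 attributed hospitalizations:
# LOS index (observed days vs risk-adjusted expected days) -> 14 / 14 = 1.0
los_index = oe_index([3, 5, 2, 4], [3.5, 4.0, 2.5, 4.0])
# Mortality index (0/1 deaths vs expected probabilities) -> 1 / 0.5 = 2.0
mortality_index = oe_index([0, 1, 0, 0], [0.10, 0.30, 0.05, 0.05])
```

An index above 1 indicates worse-than-expected outcomes; below 1, better than expected.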
Patient Survey Scores. The satisfaction of the patient’s experience during hospitalization was measured via post-discharge surveys administered by InMoment. Two survey questions were selected because they related directly to a provider’s interaction with the patient: “My interactions with doctors were excellent” (Doctor) and “I received the best possible care” (Care). A third question, “I would recommend this hospital to my family and friends,” was selected as a proxy measure of the overall experience and, in the aggregate, is referred to as the net promoter score (NPS).13,14 The responses were measured on an 11-point Likert scale, ranging from “Strongly Disagree” (0) to “Strongly Agree” (10); “N/A” or missing responses were excluded.
The Likert responses were coded to 3 discrete values as follows: if the value was between 0 and 6, then -1 (ie, detractor); between 7 and 8, then 0 (ie, neutral); otherwise 1 (ie, promoter). Averaging these coded responses results in a patient survey score for each question. Specifically, let nj be the number of hospitalizations attributed to provider j in which the patient responded to the survey question. Let sij ∈ {−1, 0, 1} be the coded response linked to hospitalization i = 1, … , nj attributed to provider j. Then the patient experience score for provider j is Sj = (∑i sij) ⁄ nj.
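The detractor/neutral/promoter coding and averaging can be sketched as follows (a hypothetical set of responses, not study data):

```python
def code_response(likert):
    """Map an 11-point Likert response (0-10) to the NPS-style coding."""
    if likert <= 6:
        return -1   # detractor (0-6)
    if likert <= 8:
        return 0    # neutral (7-8)
    return 1        # promoter (9-10)

def survey_score(responses):
    """Average of the coded responses for one provider and question."""
    coded = [code_response(r) for r in responses]
    return sum(coded) / len(coded)

# Two promoters, one neutral, one detractor: (1 + 1 + 0 - 1) / 4 = 0.25
score = survey_score([10, 9, 8, 3])
```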
Handling Ties in Provider Performance Measures. Because ties can occur in the PAPR approach for all measures, a tie-breaking strategy is needed. For LOS indices, ties are less likely because their numerator is strictly greater than 0, and expected LOS values are typically distinct enough. Indeed, no ties were found in this study for LOS indices. However, mortality and readmission indices can routinely result in ties when the best possible index is achieved, such as 0 deaths or readmissions among attributed hospitalizations. To help differentiate between those indices in the PAPR approach, the total estimated risk (denominator) was utilized as a secondary scoring criterion.
Mortality and readmission metrics were addressed by sorting first by the outcome (mortality index), and second by the denominator (total estimated risk). For example, if provider A has the same mortality rate as provider B, then provider A would be ranked higher if the denominator was larger, indicating a higher risk for mortality.
Similarly, it was very common for providers to have the same overall average rating for a survey question. Therefore, the denominator (number of respondents) was used to break ties. However, the denominator sorting was bidirectional. For example, if the tied score was positive (more promoters than detractors) for providers A and B, then provider A would be ranked higher if the denominator was larger. Conversely, if the tied score between providers A and B was neutral or negative (more detractors than promoters), then provider A would be ranked lower if the denominator was larger.
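The two tie-breaking rules can be expressed as sort keys; the following Python sketch uses illustrative field names, not the study's actual data model:

```python
def clinical_rank_key(provider):
    # Lower OE index is better; ties broken by larger total expected
    # risk (the denominator), so negate risk for ascending sort order.
    return (provider["oe_index"], -provider["total_expected_risk"])

def survey_rank_key(provider):
    # Higher score is better. The denominator tie-break is bidirectional:
    # more respondents strengthen a positive score but weaken a
    # neutral or negative one.
    score, n = provider["score"], provider["n_respondents"]
    return (-score, -n if score > 0 else n)

providers = [
    {"name": "A", "score": 0.25, "n_respondents": 40},
    {"name": "B", "score": 0.25, "n_respondents": 12},
]
# Tied positive score: A ranks ahead of B because more patients agreed.
ranked = sorted(providers, key=survey_rank_key)
```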
Individual Provider Metrics for the PAMM Method
For the PAMM method, model-based metrics were derived using a MM model.8 Specifically, let J be the number of rotating providers in a health care system. Let Yi be an outcome of interest from hospitalization i, X1i, …, Xpi be fixed effects or covariates, and ß1, …, ßp be the coefficients for the respective covariates. Then the generalized MM statistical model is
g(μi) = ß0 + ß1X1i + ⋯ + ßpXpi + ∑j wijγj (Eq. 2)
where g(μi) is a link function between the mean of the outcome, μi, and its linear predictor; ß0 is the marginal intercept; wij represents the attribution weight of provider j on hospitalization i (described in Equation 1); and γj represents the random effect of provider j on the outcome, with γj ~ N(0, σγ2).
For the mortality and readmission binary outcomes, logistic regression was performed using a logit link function, with the corresponding expected probability as the only fixed covariate. The expected probabilities were first converted into odds and then log-transformed before entering the model. For LOS, Poisson regression was performed using a log link function with the log-transformed expected LOS as the only fixed covariate. For coded patient experience responses, an ordered logistic regression was performed using a cumulative logit link function (no fixed effects were added).
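The study fitted these models in SAS; purely to illustrate the structure of Equation 2, the following Python sketch builds the linear predictor for the LOS (log-link) case from simulated weights, covariates, and provider effects:

```python
import numpy as np

rng = np.random.default_rng(0)

# Attribution weights W (one row per hospitalization; rows sum to 1),
# as produced by Equation 1. Values here are made up for illustration.
W = np.array([
    [0.375, 0.125, 0.5],
    [1.0,   0.0,   0.0],
    [0.5,   0.5,   0.0],
    [0.0,   0.25,  0.75],
    [0.2,   0.3,   0.5],
])
beta0, beta1 = 0.1, 1.0                         # marginal intercept and slope
x = rng.normal(size=W.shape[0])                 # log-transformed expected LOS
gamma = rng.normal(scale=0.2, size=W.shape[1])  # provider effects, gamma_j ~ N(0, 0.2^2)

# Equation 2 linear predictor: g(mu_i) = b0 + b1*x_i + sum_j w_ij * gamma_j
eta = beta0 + beta1 * x + W @ gamma
mu = np.exp(eta)   # invert the log link used for the LOS (Poisson) model
```

Estimating ß and the γj jointly is the job of the fitting software; this sketch only shows how the weighted provider effects enter the predictor.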
MM Model-based Metrics. Each fitted MM model produces a predicted random effect for each provider. The provider-specific random effects can be interpreted as the unobserved influence of each provider on the outcome after controlling for any fixed effect included in the model. Therefore, the provider-specific random effects were used to evaluate the relative provider performance, which is analogous to the individual provider-level metrics used in the PAPR method.
Measuring provider performance using a MM model is more flexible and robust to outliers compared to the standard approach using OE indices or simple averages. First, although not investigated here, the effect of patient-, visit-, provider-, and/or temporal-level covariates can be controlled when evaluating provider performance. For example, a patient’s socioeconomic status, a provider’s workload, and seasonal factors can be added to the MM model. These external factors are not accounted for in OE indices.
Another advantage of using predicted random effects is the concept of “shrinkage.” The process of estimating random effects inherently accounts for small sample sizes (when providers do not treat a large enough sample of patients) and/or when there is a large ratio of patient variance to provider variance (for instance, when patient outcome variability is much higher compared to provider performance variability). In both cases, the estimation of the random effect is pulled ever closer to 0, signaling that the provider performance is closer to the population average. See Henderson15 and Mood16 for further details.
In contrast, OE indices can result in unreliable estimates when a provider has not cared for many patients. This is especially prevalent when the outcome is binary with a low probability of occurring, such as mortality. Indeed, provider-level mortality OE indices are routinely 0 when the patient counts are low, which skews performance rankings unfairly. Finally, OE indices also ignore the magnitude of the variance of an outcome between providers and patients, which can be large.
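The shrinkage behavior can be illustrated with the simple Gaussian random-intercept case, where the predicted effect scales a provider's raw mean deviation by λ = σγ² ⁄ (σγ² + σe² ⁄ n). This is a simplification of the GLMM setting used in the study, shown only to convey the intuition:

```python
def shrinkage_factor(n_patients, var_provider, var_patient):
    """Shrinkage multiplier applied to a provider's raw mean deviation
    in a simple Gaussian random-intercept model:
        lambda = s_g^2 / (s_g^2 + s_e^2 / n).
    Few patients, or a large patient-to-provider variance ratio, pulls
    the predicted random effect toward 0 (the population average)."""
    return var_provider / (var_provider + var_patient / n_patients)

# Same raw performance deviation, very different evidence behind it:
lam_small = shrinkage_factor(5, 0.03, 1.0)    # few patients: heavy shrinkage
lam_large = shrinkage_factor(500, 0.03, 1.0)  # many patients: little shrinkage
```

The variance values are placeholders; the point is only that the multiplier grows toward 1 as a provider accumulates patients.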
Comparison Methodology
In this study, we seek to compare the 2 methods of attribution, PAPR and PAMM, to determine whether there are meaningful differences between them when measuring provider performance. Using retrospective data described in the next section, each attribution method was used independently to derive provider-level metrics. To assess relative performance, percentiles were assigned to each provider based on their metric values so that, in the end, there were 2 percentile ranks for each provider for each metric.
Using these paired percentiles, we derived the following measures of concordance, similar to Herzke et al3: (1) the percent concordance measure, defined as the number of providers who landed in the top half (greater than the median) or bottom half under both attribution models, divided by the total number of providers; (2) the median of the absolute difference in percentiles under both attribution models; and (3) the Pearson correlation coefficient of the paired provider percentiles. The first measure is a global measure of concordance between the 2 approaches and would be expected to be 50% by chance. The second measure gauges how an individual provider’s rank is affected by the change in attribution methodology. The third measure is a statistical measure of linear correlation of the paired percentiles and was not included in the Herzke et al3 study.
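For concreteness, the 3 comparison measures can be computed as follows; the input percentiles are hypothetical:

```python
import numpy as np

def concordance_measures(pct_a, pct_b):
    """The 3 agreement measures used to compare attribution methods,
    given each provider's percentile under each method."""
    p1 = np.asarray(pct_a, dtype=float)
    p2 = np.asarray(pct_b, dtype=float)
    same_half = (p1 > np.median(p1)) == (p2 > np.median(p2))
    pct_concordance = np.mean(same_half) * 100    # expected 50% by chance
    median_abs_diff = np.median(np.abs(p1 - p2))  # per-provider rank shift
    pearson_r = np.corrcoef(p1, p2)[0, 1]         # linear correlation
    return pct_concordance, median_abs_diff, pearson_r

# Hypothetical percentiles for 4 providers under the 2 attribution methods
pc, mad, r = concordance_measures([90, 70, 30, 10], [80, 60, 40, 20])
```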
All statistical analyses were performed on SAS (version 9.4; Cary, NC) and the MM models were fitted using PROC GLIMMIX with the EFFECT statement. The Banner Health Institutional Review Board approved this study.
Results
Descriptive Statistics
A total of
Multi-Membership Model Results
Table 3 displays the results after independently fitting MM models to each of the 3 clinical outcomes. Along with a marginal intercept, the only covariate in each model was the corresponding expected value after a transformation. This was added to use the same information that is typically used in OE indices, therefore allowing for a proper comparison between the 2 attribution methods. The provider-level variance represents the between-provider variation and measures the amount of influence providers have on the corresponding outcome after controlling for any covariates in the model. A provider-level variance of 0 would indicate that providers do not have any influence on the outcome. While the mortality and readmission model results can be compared to each other, the LOS model cannot, given its different scale and transformation.
The results in Table 3 suggest that each expected value covariate is highly correlated with its corresponding outcome, which is the anticipated conclusion given that they are constructed in this fashion. The estimated provider-level variances indicate that, after including an expected value in the model, providers have less of an influence on a patient’s LOS and likelihood of being readmitted. On the other hand, the results suggest that providers have much more influence on the likelihood of a patient dying in the hospital, even after controlling for an expected mortality covariate.
Table 4 shows the results after independently fitting MM-ordered logistic models to each of the 3 survey questions. The similar provider-level variances suggest that providers have a comparable influence on the patient’s perception of the quality of their interactions with the doctor (Doctor), the quality of the care they received (Care), and their likelihood of recommending the hospital to family and friends (NPS).
Comparison Results Between Both Attribution Methods
Table 5 compares the 2 attribution methods when ranking providers based on their performance on each outcome measure. The comparison metrics gauge how well the 2 methods agree overall (percent concordance), agree at the provider level (absolute percentile difference and interquartile range [IQR]), and how the paired percentiles linearly correlate to each other (Pearson correlation coefficient).
LOS, by a small margin, had the lowest concordance of the clinical outcomes (72.1%), followed by mortality (75.9%) and readmissions (82.1%). Generally, the survey scores had higher percent concordance than the clinical outcome measures, with Doctor at 84.1%, Care at 85.9%, and NPS the highest at 86.6%. Given that the percent concordance expected by chance is 50%, there was notable discordance, especially for the clinical outcome measures. Using LOS performance as an example, one attribution methodology ranked a provider in the top half while the other ranked the same provider in the bottom half about 28% of the time.
The median absolute percentile difference between the 2 methods was more modest (between 7 and 15). Still, there were some providers whose performance ranking was heavily affected by the attribution methodology: for certain clinical measures, the choice of attribution method could shift a provider’s performance percentile by as many as 90 points.
The paired percentiles were positively correlated when ranking performance using any of the 6 measures. This suggests that both methodologies assess performance generally in the same direction, irrespective of the methodology and measure. We did not investigate more complex correlation measures and left this for future research.
It should be noted that ties occurred much more frequently with the PAPR method than when using PAMM and therefore required tie-breaking rules to be designed. Given the nature of OE indices, PAPR methodology is especially sensitive to ties whenever the measure includes counting the number of events (for example, mortality and readmissions) and whenever there are many providers with very few attributed patients. On the other hand, using the PAMM method is much more robust against ties given that the summation of all the weighted attributed outcomes will rarely result in ties, even with a nominal set of providers.
Discussion
In this study, the PAMM methodology was introduced and was used to assess relative provider performance on 3 clinical outcome measures and 3 patient survey scores. The new approach aims to distribute each outcome among all providers who provided care for a patient in an inpatient setting. Clinical notes were used to account for patient-to-provider interactions, and fitted MM statistical models were used to compute the effects that each provider had on each outcome. The provider effect was introduced as a random effect, and the set of predicted random effects was used to rank the performance of each provider.
The PAMM approach was compared to the more traditional methodology, PAPR, where each patient is attributed to only 1 provider: the discharging physician in this study. Using this approach, OE indices of clinical outcomes and averages of survey scores were used to rank the performance of each provider. This approach resulted in many ties, which were broken based on the number of hospitalizations, although other tie-breaking methods may be used in practice.
Both methodologies showed modest concordance with each other for the clinical outcomes, but higher concordance for the patient survey scores. This was also true when using the Pearson correlation coefficient to assess agreement. The 1 outcome measure that showed the least concordance and least linear correlation between methods was LOS, which would suggest that LOS performance is more sensitive to the attribution methodology that is used. However, it was the least concordant by a small margin.
Furthermore, although the medians of the absolute percentile differences were small, there were some providers who had large deviations, suggesting that some providers would move from being shown as high-performers to low-performers and vice versa based on the chosen attribution method. We investigated examples of this and determined that the root cause was the difference in effective sample sizes for a provider. For the PAPR method, the effective sample size is simply the number of hospitalizations attributed to the provider. For the PAMM method, the effective sample size is the sum of all non-zero weights across all hospitalizations where the provider cared for a patient. By and large, the PAMM methodology provides more information about the provider effect on an outcome than the PAPR approach because every provider-patient interaction is considered. For example, providers who do not routinely discharge patients, but often care for patients, will have rankings that differ dramatically between the 2 methods.
The PAMM methodology has many statistical advantages that were not fully utilized in this comparative study. For example, we did not include any covariates in the MM models except for the expected value of the outcome, when it was available. Still, it is known that other covariates can impact an outcome as well, such as the patient’s age, socioeconomic indicators, existing chronic conditions, and severity of hospitalization, which can be added to the MM models as fixed effects. In this way, the PAMM approach can control for these other covariates, which are generally outside the control of providers but ignored by OE indices. Therefore, using the PAMM approach would provide a fairer comparison of provider performance.
Using the PAMM method, most providers had a large sample size to assess their performance once all the weighted interactions were included. Still, there were a few who did not care for many patients for a variety of reasons. In these scenarios, MM models “borrow” strength from other providers to produce a more robust predicted provider effect by using a weighted average between the overall population trend and the specific provider outcomes (see Rao and Molina17). As a result, PAMM is a more suitable approach when the sample sizes of patients attributed to providers can be small.
One of the most interesting findings of this study was the relative size of the provider-level variance to the size of the fixed effect in each model (Table 3). Except for mortality, these variances suggest that there is a small difference in performance from one provider to another. However, these should be interpreted as the variance when only 1 provider is involved in the care of a patient. When multiple providers are involved, using basic statistical theory, the overall provider-level variance will be σγ2 ∑j wij2 (see Equation 2). For example, the estimated variance among providers for LOS was 0.03 (on a log scale), but, using the scenario in the Figure, the overall provider-level variance for this hospitalization will be 0.03 × (0.375² + 0.125² + 0.5²) ≈ 0.012. Hence, the combined effect of providers on LOS is less than would be expected. Indeed, as more providers are involved with a patient’s care, the more their combined influence on an outcome is diluted.
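The dilution calculation for the Figure's 3-provider scenario works out as follows:

```python
# Overall provider-level variance for one hospitalization: s_g^2 * sum(w_ij^2).
var_provider = 0.03            # estimated LOS provider variance (log scale, Table 3)
weights = [0.375, 0.125, 0.5]  # attribution weights for the Figure's scenario
diluted = var_provider * sum(w**2 for w in weights)
# 0.03 * 0.40625 = 0.0121875, ie, roughly 0.012: well below the
# single-provider variance of 0.03.
```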
In this study, the PAMM approach placed an equal weight on all provider-patient interactions via clinical note authorship, but that may not be optimal in some settings. For example, it may make more sense to set a higher weight on the provider who admitted or discharged the patient while placing less (or 0) weight on all other interactions. In the extreme, if the full weight were placed on 1 provider interaction (eg, at discharge), the MM model would reduce to a one-way random-effects model. The flexibility of weighting interactions is a feature of the PAMM approach, but any weighting framework must be transparent to the providers before implementation.
Conclusion
This study demonstrates that the PAMM approach is a feasible option within a large health care organization. For P4P programs to be successful, providers must be able to trust that their performance will be fairly assessed and that all provider-patient interactions are captured to provide a full comparison amongst their peers. The PAMM methodology is one solution to spread the positive (and negative) outcomes across all providers who cared for a patient and therefore, if implemented, would add trust and fairness when measuring and assessing provider performance.
Acknowledgments: The authors thank Barrie Bradley for his support in the initial stages of this research and Dr. Syed Ismail Jafri for his help and support on the standard approaches of assessing and measuring provider performances.
Corresponding author: Rachel Ginn, MS, Banner Health Corporation, 2901 N. Central Ave., Phoenix, AZ 85012; [email protected].
Financial disclosures: None.
1. Abduljawad A, Al-Assaf AF. Incentives for better performance in health care. Sultan Qaboos Univ Med J. 2011;11:201-206.
2. Milstein R, Schreyoegg J. Pay for performance in the inpatient sector: a review of 34 P4P programs in 14 OECD countries. Health Policy. 2016;120:1125-1140.
3. Herzke CA, Michtalik HJ, Durkin N, et al. A method for attributing patient-level metrics to rotating providers in an inpatient setting. J Hosp Med. 2018;13:470-475.
4. Batbaatar E, Dorjdagva J, Luvsannyam A, Savino MM, Amenta P. Determinants of patient satisfaction: a systematic review. Perspect Public Health. 2017;137:89-101.
5. Ballou D, Sanders W, Wright P. Controlling for student background in value-added assessment of teachers. J Educ Behav Stat. 2004;29:37-65.
6. Hill PW, Goldstein H. Multilevel modeling of educational data with cross-classification and missing identification for units. J Educ Behav Stat. 1998;23:117-128.
7. Rasbash J, Browne WJ. Handbook of Multilevel Analysis. Springer; 2007.
8. Browne WJ, Goldstein H, Rasbash J. Multiple membership multiple classification (MMMC) models. Statistical Modelling. 2001;1:103-124.
9. Sanders WL, Horn SP. The Tennessee Value-Added Assessment System (TVAAS)—mixed-model methodology in educational assessment. J Pers Eval Educ. 1994;8:299-311.
10. Kroch EA, Duan M. CareScience Risk Assessment Model: Hospital Performance Measurement. Premier, Inc., 2008. http://www.ahrq.gov/qual/mortality/KrochRisk.htm
11. Schumacher DJ, Wu DTY, Meganathan K, et al. A feasibility study to attribute patients to primary interns on inpatient ward teams using electronic health record data. Acad Med. 2019;94:1376-1383.
12. Simoes J, Krumholz HM, Lin Z. Hospital-level 30-day risk-standardized readmission measure. Centers for Medicare & Medicaid Services, 2018. https://www.cms.gov/Medicare/Quality-Initiatives-Patient-Assessment-Instruments/HospitalQualityInits/Downloads/Hospital-Wide-All-Cause-Readmission-Updates.zip
13. Krol MW, de Boer D, Delnoij DM, Rademakers JJDJM. The Net Promoter Score: an asset to patient experience surveys? Health Expect. 2015;18:3099-3109.
14. Doyle C, Lennox L, Bell D. A systematic review of evidence on the links between patient experience and clinical safety and effectiveness. BMJ Open. 2013;3:e001570.
15. Henderson CR. Sire evaluation and genetic trends. J Anim Sci. 1973;1973:10-41.
16. Mood AM. Introduction to the Theory of Statistics. McGraw-Hill; 1950.
17. Rao JNK, Molina I. Small Area Estimation. Wiley; 2015.
From Banner Health Corporation, Phoenix, AZ.
Background: Health care providers are routinely incentivized with pay-for-performance (P4P) metrics to increase the quality of care. In an inpatient setting, P4P models typically measure quality by attributing each patient’s outcome to a single provider even though many providers routinely care for the patient. This study investigates a new attribution approach aiming to distribute each outcome across all providers who provided care.
Methods: The methodology relies on a multi-membership model and is demonstrated in the Banner Health system using 3 clinical outcome measures (length of stay, 30-day readmissions, and mortality) and responses to 3 survey questions that measure a patient’s perception of their care. The new approach is compared to the “standard” method, which attributes each patient to only 1 provider.
Results: When ranking by clinical outcomes, both methods were concordant 72.1% to 82.1% of the time for top-half/bottom-half rankings, with a median percentile difference between 7 and 15. When ranking by survey scores, there was more agreement, with concordance between 84.1% and 86.6% and a median percentile difference between 11 and 13. Last, Pearson correlation coefficients of the paired percentiles ranged from 0.56 to 0.78.
Conclusion: The new approach provides a fairer solution when measuring provider performance.
Keywords: patient attribution; PAMM; PAPR; random effect model; pay for performance.
Providers practicing in hospitals are routinely evaluated based on their performance and, in many cases, are financially incentivized for better-than-average performance within a pay-for-performance (P4P) model. The use of P4P models is based on the belief that they will “improve, motivate, and enhance providers to pursue aggressively and ultimately achieve the quality performance targets thus decreasing the number of medical errors with less malpractice events.”1 Although P4P models continue to gain traction in health care, they have been challenging to implement.
One concern involves the general quality of implementation, such as defining metrics and targets, setting payout amounts, managing technology and market conditions, and gauging the level of transparency to the provider.2 Another challenge, and the focus of this project, is the concern around measuring performance in a way that avoids perceptions of unfairness. This concern can be minimized if the attribution is handled in a fairer way, by spreading it across all providers who affected the outcome, whether in a positive or negative direction.3
To implement these models, the performance of providers needs to be measured and tracked periodically. This requires linking, or attributing, a patient’s outcome to a provider, which is almost always the attending or discharging provider (ie, a single provider).3 In this single-provider attribution approach, one provider will receive all the credit (good or bad) for their respective patients’ outcomes, even though the provider may have seen the patient only a fraction of the time during the hospitalization. Attributing outcomes—for example, length of stay (LOS), readmission rate, mortality rate, net promoter score (NPS)—using this approach reduces the validity of metrics designed to measure provider performance, especially in a rotating provider environment where many providers interact with and care for a patient. For example, the quality of providers’ interpersonal skills and competence were among the strongest determinants of patient satisfaction,4 but it is not credible that this is solely based on the last provider during a hospitalization.
Proportionally distributing the attribution of an outcome has been used successfully in other contexts. Typically, a statistical modeling approach using a multi-membership framework is used because it can handle the sometimes-complicated relationships within the hierarchy. It also allows for auxiliary variables to be introduced, which can help explain and control for exogenous effects.5-7 For example, in the education setting, standardized testing is administered to students at defined years of schooling: at grades 4, 8, and 10, for instance. The progress of students, measured as the academic gains between test years, is proportionally attributed to all the teachers whom the student has had between the test years. These partial attributions are combined to evaluate overall teacher performance.8,9
Although the multi-membership framework has been used in other industries, it has yet to be applied in measuring provider performance. The purpose of this project is to investigate the impact of using a multi-provider approach compared to the standard single-provider approach. The findings may lead to modifications in the way a provider’s performance is measured and, thus, how providers are compensated. A similar study investigated the impact of proportionally distributing patients’ outcomes across all rotating providers using a weighting method based on billing practices to measure the partial impact of each provider.3
This study is different in 2 fundamental ways. First, attribution is weighted based on the number of clinically documented interactions (via clinical notes) between a patient and all rotating providers during the hospitalization. Second, performance is measured via multi-membership models, which can estimate the effect (both positive and negative) that a provider has on an outcome, even when caring for a patient a fraction of the time during the hospitalization.
Methods
Setting
Banner Health is a non-profit, multi-hospital health care system across 6 states in the western United States that is uniquely positioned to study provider quality attribution models. Not only does it have a large number of providers and serve a broad patient population, but it also uses a single instance of Cerner (Kansas City, MO), an enterprise-level electronic health record (EHR) system that connects all of its facilities and allows for advanced analytics across the system.
For this study, we included only general medicine and surgery patients admitted and discharged from the inpatient setting between January 1, 2018, and December 31, 2018, who were between 18 and 89 years old at admission, and who had a LOS between 1 and 14 days. Visit- and patient-level data were collected from Cerner, while outcome data, and corresponding expected outcome data, were obtained from Premier, Inc. (Charlotte, NC) using their CareScience methodologies.10 To measure patient experience, response data were extracted from post-discharge surveys administered by InMoment (Salt Lake City, UT).
Provider Attribution Models
Provider Attribution by Physician of Record (PAPR). In the standard approach, denoted here as the PAPR model, 1 provider—typically the attending or discharging provider, which may be the same person—is attributed to the entire hospitalization. This provider is responsible for the patient’s care, and all patient outcomes are aggregated and attributed to the provider to gauge his or her performance. The PAPR model is the most popular form of attribution across many health care systems and is routinely used for P4P incentives.
In this study, the discharging provider was used when attributing hospitalizations using the PAPR model. Providers responsible for fewer than 12 discharges in the calendar year were excluded. Because of the directness of this type of attribution, the performance of 1 provider does not account for the performance of the other rotating providers during hospitalizations.
Provider Attribution by Multiple Membership (PAMM). In contrast, we introduce another attribution approach here that is designed to assign partial attribution to each provider who cares for the patient during the hospitalization. To aggregate the partial attributions, and possibly control for any exogenous or risk-based factors, a multiple-membership, or multi-member (MM), model is used. The MM model can measure the effect of a provider on an outcome even when the patient-to-provider relationship is complex, such as in a rotating provider environment.8
The purpose of this study is to compare attribution models and to determine whether there are meaningful differences between them. Therefore, for comparison purposes, the same discharging providers who are eligible under the PAPR approach are also the providers eligible for the PAMM approach, so that both attribution models use the same set of providers. All other providers are excluded because their performance would not be comparable to the PAPR approach.
While there are many ways to document provider-to-patient interactions, 2 methods are available in almost all health care systems. The first method is to link a provider’s billing charges to each patient-day combination. This approach limits the attribution to 1 provider per patient per day because multiple rotating providers cannot charge for the same patient-day combination.3 However, many providers interact with a patient on the same day, so using this approach excludes non-billed provider-to-patient interactions.
The second method, which was used in this study, relies on documented clinical notes within the EHR to determine how attribution is shared. In this approach, attribution is weighted based on the authorship of 3 types of eligible clinical notes: admitting history/physical notes (during admission), progress notes (during subsequent days), and discharge summary notes (during final discharge). This will (likely) result in many providers being linked to a patient on each day, which better reflects the clinical setting (Figure). Recently, clinical notes were used to attribute care of patients in an inpatient setting, and it was found that this approach provides a reliable way of tracking interactions and assigning ownership.11
The provider-level attribution weights are based on the share of authorships of eligible note types. Specifically, for each provider j, let aij be the total count of eligible note types for hospitalization i authored by provider j, and let ai be the overall total count of eligible note types for hospitalization i. Then the attribution weight is
wij = aij ⁄ ai (Eq. 1)
for hospitalization i and provider j. Note that ∑jwij = 1: in other words, the total attribution, summed across all providers, is constrained to be 1 for each hospitalization.
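As a concrete illustration of Equation 1, the attribution weights can be computed by counting eligible note authorships per provider and normalizing by the hospitalization's total. The Python sketch below uses hypothetical provider IDs and note counts; it is an illustration, not the study's SAS pipeline:

```python
from collections import Counter

def attribution_weights(note_authors):
    """Compute Eq. 1 attribution weights from the list of eligible-note
    author IDs for one hospitalization: w_ij = a_ij / a_i."""
    counts = Counter(note_authors)            # a_ij for each provider j
    total = sum(counts.values())              # a_i for hospitalization i
    return {provider: n / total for provider, n in counts.items()}

# Hypothetical hospitalization with 8 eligible notes authored by 3 providers
weights = attribution_weights(
    ["dr_a", "dr_a", "dr_a", "dr_b", "dr_c", "dr_c", "dr_c", "dr_c"])
# weights: dr_a 0.375, dr_b 0.125, dr_c 0.5 -- and they sum to 1, as required
```

Because every authorship contributes, multiple providers can share credit on the same patient-day, unlike billing-based attribution.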
Patient Outcomes
Outcomes were chosen based on their routine use in health care systems as standards when evaluating provider performance. This study included 6 outcomes: inpatient LOS, inpatient mortality, 30-day inpatient readmission, and patient responses from 3 survey questions. These outcomes can be collected without any manual chart reviews, and therefore are viewed as objective outcomes of provider performance.
Each outcome was aggregated for each provider using both attribution methods independently. For the PAPR method, observed-to-expected (OE) indices for LOS, mortality, and readmissions were calculated along with average patient survey scores. For the PAMM method, provider-level random effects from the fitted models were used. In both cases, the calculated measures were used for ranking purposes when determining top (or bottom) providers for each outcome.
Individual Provider Metrics for the PAPR Method
Inpatient LOS Index. Hospital inpatient LOS was measured as the number of days between admission date and discharge date. For each hospital visit, an expected LOS was determined using Premier’s CareScience Analytics (CSA) risk-adjustment methodology.10 The CSA methodology for LOS incorporates a patient’s clinical history, demographics, and visit-related administrative information.
Let nj be the number of hospitalizations attributed to provider j. Let oij and eij be the observed and expected LOS, respectively, for hospitalization i = 1,…,nj attributed to provider j. Then the inpatient LOS index for provider j is Lj = ∑ioij⁄∑ieij.
Inpatient Mortality Index. Inpatient mortality was defined as the death of the patient during hospitalization. For each hospitalization, an expected mortality probability was determined using Premier’s CSA risk-adjustment methodology.10 The CSA methodology for mortality incorporates a patient’s demographics and comorbidities.
Just as before, let nj be the number of hospitalizations attributed to provider j. Let mij = 1 if the patient died during hospitalization i = 1, … , nj attributed to provider j; mij = 0 otherwise. Let pij(m) be the corresponding expected mortality probability. Then the inpatient mortality index for provider j is Mj = ∑imij⁄∑ipij(m).
30-Day Inpatient Readmission Index. A 30-day inpatient readmission was defined as the event when a patient is discharged and readmits back into the inpatient setting within 30 days. The inclusion criteria defined by the Centers for Medicare and Medicaid Services (CMS) all-cause hospital-wide readmission measure was used and, consequently, planned readmissions were excluded.12 Readmissions could occur at any Banner hospital, including the same hospital. For each hospital visit, an expected readmission probability was derived using Premier’s CSA risk-adjustment methodology.10 The CSA methodology for readmissions incorporates a patient’s clinical history, demographics, and visit-related administrative information.
Let nj be the number of hospitalizations attributed to provider j. Let rij = 1 if the patient had a readmission following hospitalization i = 1, … , nj attributed to provider j; rij = 0 otherwise. Let pij(r) be the corresponding expected readmission probability. Then the 30-day inpatient readmission index for provider j is Rj = ∑irij ⁄∑ipij(r).
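The three PAPR indices (Lj, Mj, Rj) all share the same observed-to-expected form, differing only in the inputs. A minimal sketch with hypothetical observed and expected values:

```python
def oe_index(observed, expected):
    """Observed-to-expected index for one provider: sum(o_ij) / sum(e_ij).
    The same form yields L_j (LOS), M_j (mortality), and R_j (readmission)."""
    return sum(observed) / sum(expected)

# Hypothetical provider with 4 attributed hospitalizations
los_index = oe_index([3, 5, 2, 7], [4.1, 4.8, 2.5, 6.0])            # observed vs expected days
mortality_index = oe_index([0, 1, 0, 0], [0.02, 0.30, 0.05, 0.10])  # deaths vs expected probabilities
```

An index below 1 indicates better-than-expected performance; note that a provider with no deaths always gets a mortality index of exactly 0, which is what makes ties so common under this approach.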
Patient Survey Scores. Patient satisfaction with the hospitalization experience was measured via post-discharge surveys administered by InMoment. Two survey questions were selected because they related directly to a provider’s interaction with the patient: “My interactions with doctors were excellent” (Doctor) and “I received the best possible care” (Care). A third question, “I would recommend this hospital to my family and friends,” was selected as a proxy measure of the overall experience and, in the aggregate, is referred to as the net promoter score (NPS).13,14 The responses were measured on an 11-point Likert scale, ranging from “Strongly Disagree” (0) to “Strongly Agree” (10); “N/A” or missing responses were excluded.
The Likert responses were coded to 3 discrete values as follows: a response between 0 and 6 was coded as −1 (ie, detractor); between 7 and 8, as 0 (ie, neutral); and between 9 and 10, as 1 (ie, promoter). Averaging these coded responses results in a patient survey score for each question. Specifically, let nj be the number of hospitalizations attributed to provider j in which the patient responded to the survey question. Let sij ∈{−1, 0, 1} be the coded response linked to hospitalization i = 1, … , nj attributed to provider j. Then the patient experience score for provider j is Sj = ∑isij⁄nj.
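The coding and averaging steps can be sketched as follows, with hypothetical responses for a single provider:

```python
def code_response(likert):
    """Map an 11-point Likert response (0-10) to a coded value:
    0-6 -> -1 (detractor), 7-8 -> 0 (neutral), 9-10 -> 1 (promoter)."""
    if likert <= 6:
        return -1
    return 0 if likert <= 8 else 1

def survey_score(responses):
    """Patient survey score S_j: mean of the coded responses."""
    coded = [code_response(r) for r in responses]
    return sum(coded) / len(coded)

# Hypothetical responses from 5 of one provider's patients
score = survey_score([10, 9, 8, 3, 10])   # codes: 1, 1, 0, -1, 1
```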
Handling Ties in Provider Performance Measures. Because ties can occur in the PAPR approach for all measures, a tie-breaking strategy is needed. For LOS indices, ties are less likely because their numerator is strictly greater than 0, and expected LOS values are typically distinct enough. Indeed, no ties were found in this study for LOS indices. However, mortality and readmission indices can routinely result in ties when the best possible index is achieved, such as 0 deaths or readmissions among attributed hospitalizations. To help differentiate between those indices in the PAPR approach, the total estimated risk (denominator) was utilized as a secondary scoring criterion.
Mortality and readmission metrics were addressed by sorting first by the outcome (mortality index), and second by the denominator (total estimated risk). For example, if provider A has the same mortality rate as provider B, then provider A would be ranked higher if the denominator was larger, indicating a higher risk for mortality.
Similarly, it was very common for providers to have the same overall average rating for a survey question. Therefore, the denominator (number of respondents) was used to break ties. However, the denominator sorting was bidirectional. For example, if the tied score was positive (more promoters than detractors) for providers A and B, then provider A would be ranked higher if the denominator was larger. Conversely, if the tied score between providers A and B was neutral or negative (more detractors than promoters), then provider A would be ranked lower if the denominator was larger.
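The tie-breaking rule for the mortality and readmission indices amounts to a two-key sort; the sketch below uses hypothetical providers and index values:

```python
# Hypothetical PAPR mortality results: (index, total expected risk).
# Lower index is better; on a tie, the larger denominator (more expected
# risk with the same observed outcome) ranks the provider more favorably.
providers = {
    "dr_a": (0.0, 4.2),   # no deaths among high-risk patients
    "dr_b": (0.0, 1.1),   # no deaths among low-risk patients
    "dr_c": (1.5, 3.0),
}
# Sort best-to-worst: index ascending, then denominator descending.
ranked = sorted(providers, key=lambda p: (providers[p][0], -providers[p][1]))
# ranked order: dr_a, dr_b, dr_c
```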
Individual Provider Metrics for the PAMM Method
For the PAMM method, model-based metrics were derived using a MM model.8 Specifically, let J be the number of rotating providers in a health care system. Let Yi be an outcome of interest from hospitalization i, X1i, …, Xpi be fixed effects or covariates, and ß1, …, ßp be the coefficients for the respective covariates. Then the generalized MM statistical model is
g(μi) = ß0 + ß1X1i + ⋯ + ßpXpi + ∑jwijγj (Eq. 2)
where g(μi) is a link function between the mean of the outcome, μi, and its linear predictor; ß0 is the marginal intercept; wij represents the attribution weight of provider j on hospitalization i (described in Equation 1); and γj represents the random effect of provider j on the outcome, with γj ~ N(0, σγ²).
For the mortality and readmission binary outcomes, logistic regression was performed using a logit link function, with the corresponding expected probability as the only fixed covariate. The expected probabilities were first converted into odds and then log-transformed before entering the model. For LOS, Poisson regression was performed using a log link function with the log-transformed expected LOS as the only fixed covariate. For coded patient experience responses, an ordered logistic regression was performed using a cumulative logit link function (no fixed effects were added).
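To make Equation 2 concrete for the mortality model, the sketch below computes the linear predictor and predicted probability for two hypothetical hospitalizations. The coefficient and random-effect values are made up for illustration; in the study these were estimated by PROC GLIMMIX:

```python
import numpy as np

# Hypothetical fitted values: one fixed covariate (logit of expected
# mortality) plus weighted provider random effects, per Eq. 2.
beta0, beta1 = -0.2, 1.0
gamma = np.array([0.3, -0.1, 0.05])     # provider random effects gamma_j
W = np.array([[0.375, 0.125, 0.5],      # w_ij: hospitalization i x provider j
              [0.0,   1.0,   0.0]])     # second stay seen by one provider only
x = np.array([-2.0, -1.2])              # log-odds of expected mortality

eta = beta0 + beta1 * x + W @ gamma     # linear predictor g(mu_i)
mu = 1.0 / (1.0 + np.exp(-eta))         # inverse logit -> predicted probability
```

The first row shows how three providers' effects are blended by their attribution weights before entering the link function.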
MM Model-based Metrics. Each fitted MM model produces a predicted random effect for each provider. The provider-specific random effects can be interpreted as the unobserved influence of each provider on the outcome after controlling for any fixed effect included in the model. Therefore, the provider-specific random effects were used to evaluate the relative provider performance, which is analogous to the individual provider-level metrics used in the PAPR method.
Measuring provider performance using a MM model is more flexible and robust to outliers compared to the standard approach using OE indices or simple averages. First, although not investigated here, the effect of patient-, visit-, provider-, and/or temporal-level covariates can be controlled when evaluating provider performance. For example, a patient’s socioeconomic status, a provider’s workload, and seasonal factors can be added to the MM model. These external factors are not accounted for in OE indices.
Another advantage of using predicted random effects is the concept of “shrinkage.” The process of estimating random effects inherently accounts for small sample sizes (when providers do not treat a large enough sample of patients) and/or when there is a large ratio of patient variance to provider variance (for instance, when patient outcome variability is much higher compared to provider performance variability). In both cases, the estimation of the random effect is pulled ever closer to 0, signaling that the provider performance is closer to the population average. See Henderson15 and Mood16 for further details.
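For a balanced one-way random-effects layout (a simplification of the MM model), the shrinkage toward 0 has a closed form. The sketch below uses hypothetical variance components to show how small samples or noisy patient outcomes pull the predicted provider effect toward the population average:

```python
def shrinkage_factor(n, var_provider, var_patient):
    """For a balanced one-way random-effects model, the BLUP of a provider
    effect shrinks the raw provider mean toward 0 by the factor
    n*sigma_g^2 / (n*sigma_g^2 + sigma_e^2)."""
    return n * var_provider / (n * var_provider + var_patient)

# Few patients -> heavy shrinkage; many patients -> estimate mostly retained
low_n = shrinkage_factor(n=5, var_provider=0.03, var_patient=1.0)     # ~0.13
high_n = shrinkage_factor(n=200, var_provider=0.03, var_patient=1.0)  # ~0.86
```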
In contrast, OE indices can result in unreliable estimates when a provider has not cared for many patients. This is especially prevalent when the outcome is binary with a low probability of occurring, such as mortality. Indeed, provider-level mortality OE indices are routinely 0 when the patient counts are low, which skews performance rankings unfairly. Finally, OE indices also ignore the magnitude of the variance of an outcome between providers and patients, which can be large.
Comparison Methodology
In this study, we seek to compare the 2 methods of attribution, PAPR and PAMM, to determine whether there are meaningful differences between them when measuring provider performance. Using retrospective data described in the next section, each attribution method was used independently to derive provider-level metrics. To assess relative performance, percentiles were assigned to each provider based on their metric values so that, in the end, there were 2 percentile ranks for each provider for each metric.
Using these paired percentiles, we derived the following measures of concordance, similar to Herzke et al3: (1) the percent concordance, defined as the number of providers who landed in the same half (top half, above the median, or bottom half) under both attribution models, divided by the total number of providers; (2) the median of the absolute difference in percentiles under both attribution models; and (3) the Pearson correlation coefficient of the paired provider percentiles. The first measure is a global measure of concordance between the 2 approaches and would be expected to be 50% by chance. The second measure gauges how an individual provider’s rank is affected by the change in attribution methodology. The third measure is a statistical measure of linear correlation of the paired percentiles and was not included in the Herzke et al3 study.
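The three comparison measures can be sketched directly from the paired percentiles; the provider percentiles below are hypothetical:

```python
import numpy as np

def concordance_measures(pct_a, pct_b):
    """The 3 comparison measures for paired provider percentiles:
    top-half/bottom-half percent concordance, median absolute
    percentile difference, and Pearson correlation."""
    a, b = np.asarray(pct_a, float), np.asarray(pct_b, float)
    pct_concord = 100.0 * np.mean((a > 50) == (b > 50))   # same half under both
    med_abs_diff = float(np.median(np.abs(a - b)))
    pearson_r = float(np.corrcoef(a, b)[0, 1])
    return pct_concord, med_abs_diff, pearson_r

# Hypothetical PAPR vs PAMM percentiles for 4 providers
c, d, r = concordance_measures([10, 40, 60, 90], [20, 60, 70, 95])
```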
All statistical analyses were performed in SAS (version 9.4; SAS Institute, Cary, NC), and the MM models were fitted using PROC GLIMMIX with the EFFECT statement. The Banner Health Institutional Review Board approved this study.
Results
Descriptive Statistics
A total of
Multi-Membership Model Results
Table 3 displays the results after independently fitting MM models to each of the 3 clinical outcomes. Along with a marginal intercept, the only covariate in each model was the corresponding expected value after a transformation. This was added to use the same information that is typically used in OE indices, therefore allowing for a proper comparison between the 2 attribution methods. The provider-level variance represents the between-provider variation and measures the amount of influence providers have on the corresponding outcome after controlling for any covariates in the model. A provider-level variance of 0 would indicate that providers do not have any influence on the outcome. While the mortality and readmission model results can be compared to each other, the LOS model cannot given its different scale and transformation altogether.
The results in Table 3 suggest that each expected value covariate is highly correlated with its corresponding outcome, which is the anticipated conclusion given that they are constructed in this fashion. The estimated provider-level variances indicate that, after including an expected value in the model, providers have less of an influence on a patient’s LOS and likelihood of being readmitted. On the other hand, the results suggest that providers have much more influence on the likelihood of a patient dying in the hospital, even after controlling for an expected mortality covariate.
Table 4 shows the results after independently fitting MM-ordered logistic models to each of the 3 survey questions. The similar provider-level variances suggest that providers have the same influence on the patient’s perception of the quality of their interactions with the doctor (Doctor), the quality of the care they received (Care), and their likelihood of recommending the hospital to a friend or family member (NPS).
Comparison Results Between Both Attribution Methods
Table 5 compares the 2 attribution methods when ranking providers based on their performance on each outcome measure. The comparison metrics gauge how well the 2 methods agree overall (percent concordance), agree at the provider level (absolute percentile difference and interquartile range [IQR]), and how the paired percentiles linearly correlate to each other (Pearson correlation coefficient).
LOS, by a small margin, had the lowest concordance of clinical outcomes (72.1%), followed by mortality (75.9%) and readmissions (82.1%). Generally, the survey scores had higher percent concordance than the clinical outcome measures, with Doctor at 84.1%, Care at 85.9%, and NPS having the highest percent concordance at 86.6%. Given that by chance the percent concordance is expected to be 50%, there was notable discordance, especially with the clinical outcome measures. Using LOS performance as an example, one attribution methodology would rank a provider in the top half or bottom half, while the other attribution methodology would rank the same provider exactly the opposite way about 28% of the time.
The median absolute percentile difference between the 2 methods was more modest (between 7 and 15). Still, there were some providers whose performance ranking was heavily impacted by the attribution methodology that was used. This was especially true when evaluating performance for certain clinical measures, where the attribution method that was used could change the provider performance percentile by up to 90 levels.
The paired percentiles were positively correlated when ranking performance using any of the 6 measures. This suggests that both methodologies assess performance generally in the same direction, irrespective of the methodology and measure. We did not investigate more complex correlation measures and left this for future research.
It should be noted that ties occurred much more frequently with the PAPR method than when using PAMM and therefore required tie-breaking rules to be designed. Given the nature of OE indices, PAPR methodology is especially sensitive to ties whenever the measure includes counting the number of events (for example, mortality and readmissions) and whenever there are many providers with very few attributed patients. On the other hand, using the PAMM method is much more robust against ties given that the summation of all the weighted attributed outcomes will rarely result in ties, even with a nominal set of providers.
Discussion
In this study, the PAMM methodology was introduced and was used to assess relative provider performance on 3 clinical outcome measures and 3 patient survey scores. The new approach aims to distribute each outcome among all providers who provided care for a patient in an inpatient setting. Clinical notes were used to account for patient-to-provider interactions, and fitted MM statistical models were used to compute the effects that each provider had on each outcome. The provider effect was introduced as a random effect, and the set of predicted random effects was used to rank the performance of each provider.
The PAMM approach was compared to the more traditional methodology, PAPR, where each patient is attributed to only 1 provider: the discharging physician in this study. Using this approach, OE indices of clinical outcomes and averages of survey scores were used to rank the performance of each provider. This approach resulted in many ties, which were broken based on the number of hospitalizations, although other tie-breaking methods may be used in practice.
Both methodologies showed modest concordance with each other for the clinical outcomes, but higher concordance for the patient survey scores. This was also true when using the Pearson correlation coefficient to assess agreement. The 1 outcome measure that showed the least concordance and least linear correlation between methods was LOS, which would suggest that LOS performance is more sensitive to the attribution methodology that is used. However, it was the least concordant by a small margin.
Furthermore, although the medians of the absolute percentile differences were small, there were some providers who had large deviations, suggesting that some providers would move from being shown as high-performers to low-performers and vice versa based on the chosen attribution method. We investigated examples of this and determined that the root cause was the difference in effective sample sizes for a provider. For the PAPR method, the effective sample size is simply the number of hospitalizations attributed to the provider. For the PAMM method, the effective sample size is the sum of all non-zero weights across all hospitalizations where the provider cared for a patient. By and large, the PAMM methodology provides more information about the provider effect on an outcome than the PAPR approach because every provider-patient interaction is considered. For example, providers who do not routinely discharge patients, but often care for patients, will have rankings that differ dramatically between the 2 methods.
The PAMM methodology has many statistical advantages that were not fully utilized in this comparative study. For example, we did not include any covariates in the MM models except for the expected value of the outcome, when it was available. Still, it is known that other covariates can impact an outcome as well, such as the patient’s age, socioeconomic indicators, existing chronic conditions, and severity of hospitalization, which can be added to the MM models as fixed effects. In this way, the PAMM approach can control for these other covariates, which are typically outside of the control of providers but typically ignored using OE indices. Therefore, using the PAMM approach would provide a fairer comparison of provider performance.
Using the PAMM method, most providers had a large sample size to assess their performance once all the weighted interactions were included. Still, there were a few who did not care for many patients for a variety of reasons. In these scenarios, MM models “borrow” strength from other providers to produce a more robust predicted provider effect by using a weighted average between the overall population trend and the specific provider outcomes (see Rao and Molina17). As a result, PAMM is a more suitable approach when the sample sizes of patients attributed to providers can be small.
One of the most interesting findings of this study was the relative size of the provider-level variance to the size of the fixed effect in each model (Table 3). Except for mortality, these variances suggest that there is a small difference in performance from one provider to another. However, these should be interpreted as the variance when only 1 provider is involved in the care of a patient. When multiple providers are involved, using basic statistical theory, the overall provider-level variance will be σγ² ∑jwij² (see Equation 2). For example, the estimated variance among providers for LOS was 0.03 (on a log scale), but, using the scenario in the Figure, the overall provider-level variance for this hospitalization will be 0.03 × (0.375² + 0.125² + 0.5²) ≈ 0.012. Hence, the combined effect of providers on LOS is less than would be expected. Indeed, as more providers are involved with a patient’s care, the more their combined influence on an outcome is diluted.
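The dilution arithmetic above can be checked directly; the weights are those of the Figure's example hospitalization:

```python
# Combined provider-level variance for one hospitalization:
# sigma_g^2 * sum_j(w_ij^2), per Equation 2.
sigma_g2 = 0.03                    # estimated provider variance for LOS (log scale)
weights = [0.375, 0.125, 0.5]      # attribution weights from the Figure
combined_var = sigma_g2 * sum(w * w for w in weights)
# combined_var ~= 0.0122, well below the single-provider variance of 0.03
```

Because the weights sum to 1, the sum of their squares is at most 1, so splitting care across providers can only shrink the combined variance.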
In this study, the PAMM approach placed an equal weight on all provider-patient interactions via clinical note authorship, but that may not be optimal in some settings. For example, it may make more sense to set a higher weight on the provider who admitted or discharged the patient while placing less (or 0) weight on all other interactions. In the extreme, if the full weight were placed on 1 provider interaction (eg, during discharge), then the MM model would reduce to a one-way random effects model. The flexibility of weighting interactions is a feature of the PAMM approach, but any weighting framework must be transparent to the providers before implementation.
Conclusion
This study demonstrates that the PAMM approach is a feasible option within a large health care organization. For P4P programs to be successful, providers must be able to trust that their performance will be fairly assessed and that all provider-patient interactions are captured to provide a full comparison amongst their peers. The PAMM methodology is one solution to spread the positive (and negative) outcomes across all providers who cared for a patient and therefore, if implemented, would add trust and fairness when measuring and assessing provider performance.
Acknowledgments: The authors thank Barrie Bradley for his support in the initial stages of this research and Dr. Syed Ismail Jafri for his help and support on the standard approaches of assessing and measuring provider performances.
Corresponding author: Rachel Ginn, MS, Banner Health Corporation, 2901 N. Central Ave., Phoenix, AZ 85012; [email protected].
Financial disclosures: None.
From Banner Health Corporation, Phoenix, AZ.
Background: Health care providers are routinely incentivized with pay-for-performance (P4P) metrics to increase the quality of care. In an inpatient setting, P4P models typically measure quality by attributing each patient’s outcome to a single provider even though many providers routinely care for the patient. This study investigates a new attribution approach aiming to distribute each outcome across all providers who provided care.
Methods: The methodology relies on a multi-membership model and is demonstrated in the Banner Health system using 3 clinical outcome measures (length of stay, 30-day readmissions, and mortality) and responses to 3 survey questions that measure a patient’s perception of their care. The new approach is compared to the “standard” method, which attributes each patient to only 1 provider.
Results: When ranking by clinical outcomes, both methods were concordant 72.1% to 82.1% of the time for top-half/bottom-half rankings, with a median percentile difference between 7 and 15. When ranking by survey scores, there was more agreement, with concordance between 84.1% and 86.6% and a median percentile difference between 11 and 13. Last, Pearson correlation coefficients of the paired percentiles ranged from 0.56 to 0.78.
Conclusion: The new approach provides a fairer solution when measuring provider performance.
Keywords: patient attribution; PAMM; PAPR; random effect model; pay for performance.
Providers practicing in hospitals are routinely evaluated based on their performance and, in many cases, are financially incentivized for a better-than-average performance within a pay-for-performance (P4P) model. The use of P4P models is based on the belief that they will “improve, motivate, and enhance providers to pursue aggressively and ultimately achieve the quality performance targets thus decreasing the number of medical errors with less malpractice events.”1 Although P4P models continue to be a movement in health care, they have been challenging to implement.
One concern involves the general quality of implementation, such as defining metrics and targets, setting payout amounts, managing technology and market conditions, and gauging the level of transparency to the provider.2 Another challenge, and the focus of this project, is the concern that performance measurement will be perceived as unfair. This concern can be minimized if attribution is handled in a fairer way, by spreading it across all providers who affected the outcome, whether in a positive or a negative direction.3
To implement these models, the performance of providers needs to be measured and tracked periodically. This requires linking, or attributing, a patient’s outcome to a provider, which is almost always the attending or discharging provider (ie, a single provider).3 In this single-provider attribution approach, one provider will receive all the credit (good or bad) for their respective patients’ outcomes, even though the provider may have seen the patient only a fraction of the time during the hospitalization. Attributing outcomes—for example, length of stay (LOS), readmission rate, mortality rate, net promoter score (NPS)—using this approach reduces the validity of metrics designed to measure provider performance, especially in a rotating provider environment where many providers interact with and care for a patient. For example, the quality of providers’ interpersonal skills and competence were among the strongest determinants of patient satisfaction,4 but it is not credible that this is solely based on the last provider during a hospitalization.
Proportionally distributing the attribution of an outcome has been used successfully in other contexts. Typically, a statistical modeling approach using a multi-membership framework is used because it can handle the sometimes-complicated relationships within the hierarchy. It also allows for auxiliary variables to be introduced, which can help explain and control for exogenous effects.5-7 For example, in the education setting, standardized testing is administered to students at defined years of schooling: at grades 4, 8, and 10, for instance. The progress of students, measured as the academic gains between test years, is proportionally attributed to all the teachers whom the student has had between the test years. These partial attributions are combined to evaluate an overall teacher performance.8,9
Although the multi-membership framework has been used in other industries, it has yet to be applied in measuring provider performance. The purpose of this project is to investigate the impact of using a multi-provider approach compared to the standard single-provider approach. The findings may lead to modifications in the way a provider’s performance is measured and, thus, how providers are compensated. A similar study investigated the impact of proportionally distributing patients’ outcomes across all rotating providers using a weighting method based on billing practices to measure the partial impact of each provider.3
This study is different in 2 fundamental ways. First, attribution is weighted based on the number of clinically documented interactions (via clinical notes) between a patient and all rotating providers during the hospitalization. Second, performance is measured via multi-membership models, which can estimate the effect (both positive and negative) that a provider has on an outcome, even when caring for a patient a fraction of the time during the hospitalization.
Methods
Setting
Banner Health is a non-profit, multi-hospital health care system operating across 6 states in the western United States that is uniquely positioned to study provider quality attribution models. Not only does it have a large number of providers and serve a broad patient population, but it also uses a single instance of Cerner (Kansas City, MO), an enterprise-level electronic health record (EHR) system that connects all of its facilities and allows for advanced analytics across the system.
For this study, we included only general medicine and surgery patients admitted and discharged from the inpatient setting between January 1, 2018, and December 31, 2018, who were between 18 and 89 years old at admission, and who had a LOS between 1 and 14 days. Visit- and patient-level data were collected from Cerner, while outcome data, and corresponding expected outcome data, were obtained from Premier, Inc. (Charlotte, NC) using their CareScience methodologies.10 To measure patient experience, response data were extracted from post-discharge surveys administered by InMoment (Salt Lake City, UT).
Provider Attribution Models
Provider Attribution by Physician of Record (PAPR). In the standard approach, denoted here as the PAPR model, 1 provider—typically the attending or discharging provider, which may be the same person—is attributed to the entire hospitalization. This provider is responsible for the patient’s care, and all patient outcomes are aggregated and attributed to the provider to gauge his or her performance. The PAPR model is the most popular form of attribution across many health care systems and is routinely used for P4P incentives.
In this study, the discharging provider was used when attributing hospitalizations using the PAPR model. Providers responsible for fewer than 12 discharges in the calendar year were excluded. Because of the directness of this type of attribution, the performance of 1 provider does not account for the performance of the other rotating providers during hospitalizations.
Provider Attribution by Multiple Membership (PAMM). In contrast, we introduce another attribution approach here that is designed to assign partial attribution to each provider who cares for the patient during the hospitalization. To aggregate the partial attributions, and possibly control for any exogenous or risk-based factors, a multiple-membership, or multi-member (MM), model is used. The MM model can measure the effect of a provider on an outcome even when the patient-to-provider relationship is complex, such as in a rotating provider environment.8
The purpose of this study is to compare attribution models and to determine whether there are meaningful differences between them. Therefore, for comparison purposes, the same discharging providers using the PAPR approach are eligible for the PAMM approach, so that both attribution models are using the same set of providers. All other providers are excluded because their performance would not be comparable to the PAPR approach.
While there are many ways to document provider-to-patient interactions, 2 methods are available in almost all health care systems. The first method is to link a provider’s billing charges to each patient-day combination. This approach limits the attribution to 1 provider per patient per day because multiple rotating providers cannot charge for the same patient-day combination.3 However, many providers interact with a patient on the same day, so using this approach excludes non-billed provider-to-patient interactions.
The second method, which was used in this study, relies on documented clinical notes within the EHR to determine how attribution is shared. In this approach, attribution is weighted based on the authorship of 3 types of eligible clinical notes: admitting history/physical notes (during admission), progress notes (during subsequent days), and discharge summary notes (during final discharge). This will (likely) result in many providers being linked to a patient on each day, which better reflects the clinical setting (Figure). Recently, clinical notes were used to attribute care of patients in an inpatient setting, and it was found that this approach provides a reliable way of tracking interactions and assigning ownership.11
The provider-level attribution weights are based on the share of authorships of eligible note types. Specifically, for each provider j, let aij be the total count of eligible note types for hospitalization i authored by provider j, and let ai be the overall total count of eligible note types for hospitalization i. Then the attribution weight is
wij = aij ⁄ ai (Eq. 1)
for hospitalization i and provider j. Note that ∑jwij = 1: in other words, the total attribution, summed across all providers, is constrained to be 1 for each hospitalization.
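As a concrete illustration, the weight computation in Equation 1 can be sketched in a few lines of Python; the provider labels and note counts below are hypothetical:

```python
from collections import Counter

def attribution_weights(note_authors):
    """Per-provider attribution weights for one hospitalization (Eq. 1):
    w_ij = a_ij / a_i, where a_ij counts the eligible notes authored by
    provider j and a_i counts all eligible notes for the stay."""
    counts = Counter(note_authors)              # a_ij for each provider j
    total = sum(counts.values())                # a_i
    return {provider: n / total for provider, n in counts.items()}

# Hypothetical stay: provider A authored the admission H&P and 2 progress
# notes, B authored 1 progress note, and C authored the discharge summary.
weights = attribution_weights(["A", "A", "A", "B", "C"])
# weights == {"A": 0.6, "B": 0.2, "C": 0.2}; the weights sum to 1.
```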
Patient Outcomes
Outcomes were chosen based on their routine use in health care systems as standards when evaluating provider performance. This study included 6 outcomes: inpatient LOS, inpatient mortality, 30-day inpatient readmission, and patient responses from 3 survey questions. These outcomes can be collected without any manual chart reviews, and therefore are viewed as objective outcomes of provider performance.
Each outcome was aggregated for each provider using both attribution methods independently. For the PAPR method, observed-to-expected (OE) indices for LOS, mortality, and readmissions were calculated along with average patient survey scores. For the PAMM method, provider-level random effects from the fitted models were used. In both cases, the calculated measures were used for ranking purposes when determining top (or bottom) providers for each outcome.
Individual Provider Metrics for the PAPR Method
Inpatient LOS Index. Hospital inpatient LOS was measured as the number of days between admission date and discharge date. For each hospital visit, an expected LOS was determined using Premier’s CareScience Analytics (CSA) risk-adjustment methodology.10 The CSA methodology for LOS incorporates a patient’s clinical history, demographics, and visit-related administrative information.
Let nj be the number of hospitalizations attributed to provider j. Let oij and eij be the observed and expected LOS, respectively, for hospitalization i = 1,…,nj attributed to provider j. Then the inpatient LOS index for provider j is Lj = ∑ioij⁄∑ieij.
Inpatient Mortality Index. Inpatient mortality was defined as the death of the patient during hospitalization. For each hospitalization, an expected mortality probability was determined using Premier’s CSA risk-adjustment methodology.10 The CSA methodology for mortality incorporates a patient’s demographics and comorbidities.
Just as before, let nj be the number of hospitalizations attributed to provider j. Let mij = 1 if the patient died during hospitalization i = 1, … , nj attributed to provider j; mij = 0 otherwise. Let pij(m) be the corresponding expected mortality probability. Then the inpatient mortality index for provider j is Mj = ∑imij⁄∑ipij(m).
30-Day Inpatient Readmission Index. A 30-day inpatient readmission was defined as the event when a patient is discharged and readmits back into the inpatient setting within 30 days. The inclusion criteria defined by the Centers for Medicare and Medicaid Services (CMS) all-cause hospital-wide readmission measure was used and, consequently, planned readmissions were excluded.12 Readmissions could occur at any Banner hospital, including the same hospital. For each hospital visit, an expected readmission probability was derived using Premier’s CSA risk-adjustment methodology.10 The CSA methodology for readmissions incorporates a patient’s clinical history, demographics, and visit-related administrative information.
Let nj be the number of hospitalizations attributed to provider j. Let rij = 1 if the patient had a readmission following hospitalization i = 1, … , nj attributed to provider j; rij = 0 otherwise. Let pij(r) be the corresponding expected readmission probability. Then the 30-day inpatient readmission index for provider j is Rj = ∑irij ⁄∑ipij(r).
Patient Survey Scores. The satisfaction of the patient’s experience during hospitalization was measured via post-discharge surveys administered by InMoment. Two survey questions were selected because they related directly to a provider’s interaction with the patient: “My interactions with doctors were excellent” (Doctor) and “I received the best possible care” (Care). A third question, “I would recommend this hospital to my family and friends,” was selected as a proxy measure of the overall experience and, in the aggregate, is referred to as the net promoter score (NPS).13,14 The responses were measured on an 11-point Likert scale, ranging from “Strongly Disagree” (0) to “Strongly Agree” (10); “N/A” or missing responses were excluded.
The Likert responses were coded to 3 discrete values as follows: if the value was between 0 and 6, then -1 (ie, detractor); between 7 and 8 (ie, neutral), then 0; otherwise 1 (ie, promoter). Averaging these coded responses results in a patient survey score for each question. Specifically, let nj be the number of hospitalizations attributed to provider j in which the patient responded to the survey question. Let sij ∈{−1, 0, 1} be the coded response linked to hospitalization i = 1, … , nj attributed to provider j. Then the patient experience score for provider j is Sj = ∑isij⁄nj.
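The PAPR provider metrics above are simple ratios and averages; a minimal sketch, using made-up observed and expected values for one provider, might look like:

```python
def oe_index(observed, expected):
    """Observed-to-expected index for one provider: the ratio of summed
    observed outcomes to summed expected outcomes (L_j, M_j, or R_j)."""
    return sum(observed) / sum(expected)

def survey_score(likert_responses):
    """Average of coded 0-10 Likert responses: 0-6 -> -1 (detractor),
    7-8 -> 0 (neutral), 9-10 -> +1 (promoter)."""
    coded = [-1 if r <= 6 else (0 if r <= 8 else 1) for r in likert_responses]
    return sum(coded) / len(coded)

# Hypothetical provider with 4 attributed hospitalizations:
los_index = oe_index([3, 5, 2, 7], [4.0, 4.5, 2.5, 6.0])  # 17/17 = 1.0, "as expected"
nps = survey_score([10, 9, 8, 3])                         # (1 + 1 + 0 - 1) / 4 = 0.25
```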
Handling Ties in Provider Performance Measures. Because ties can occur in the PAPR approach for all measures, a tie-breaking strategy is needed. For LOS indices, ties are less likely because their numerator is strictly greater than 0, and expected LOS values are typically distinct enough. Indeed, no ties were found in this study for LOS indices. However, mortality and readmission indices can routinely result in ties when the best possible index is achieved, such as 0 deaths or readmissions among attributed hospitalizations. To help differentiate between those indices in the PAPR approach, the total estimated risk (denominator) was utilized as a secondary scoring criterion.
Mortality and readmission metrics were addressed by sorting first by the outcome (mortality index), and second by the denominator (total estimated risk). For example, if provider A has the same mortality rate as provider B, then provider A would be ranked higher if the denominator was larger, indicating a higher risk for mortality.
Similarly, it was very common for providers to have the same overall average rating for a survey question. Therefore, the denominator (number of respondents) was used to break ties. However, the denominator sorting was bidirectional. For example, if the tied score was positive (more promoters than detractors) for providers A and B, then provider A would be ranked higher if the denominator was larger. Conversely, if the tied score between providers A and B was neutral or negative (more detractors than promoters), then provider A would be ranked lower if the denominator was larger.
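The bidirectional tie-break for survey scores can be encoded as a sort key. This sketch, with invented providers and respondent counts, ranks best-first:

```python
def survey_sort_key(score, n_respondents):
    """Sort key for ranking survey scores, best provider first: higher
    score wins; among ties, a larger respondent count strengthens a
    positive score but weakens a neutral or negative one."""
    tiebreak = n_respondents if score > 0 else -n_respondents
    return (-score, -tiebreak)   # ascending sort puts the best first

# (name, tied score, number of respondents) for hypothetical providers
providers = [("A", 0.5, 40), ("B", 0.5, 10), ("C", -0.2, 30), ("D", -0.2, 5)]
ranked = sorted(providers, key=lambda p: survey_sort_key(p[1], p[2]))
# A outranks B (positive tie, more respondents); D outranks C
# (negative tie, fewer respondents): order is A, B, D, C.
```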
Individual Provider Metrics for the PAMM Method
For the PAMM method, model-based metrics were derived using a MM model.8 Specifically, let J be the number of rotating providers in a health care system. Let Yi be an outcome of interest from hospitalization i, X1i, …, Xpi be fixed effects or covariates, and ß1, …, ßp be the coefficients for the respective covariates. Then the generalized MM statistical model is
g(μi) = ß0 + ß1X1i + ⋯ + ßpXpi + ∑jwijγj (Eq. 2)
where g(μi) is a link function between the mean of the outcome, μi, and its linear predictor; ß0 is the marginal intercept; wij represents the attribution weight of provider j on hospitalization i (described in Equation 1); and γj represents the random effect of provider j on the outcome, with γj ~ N(0, σγ²).
For the mortality and readmission binary outcomes, logistic regression was performed using a logit link function, with the corresponding expected probability as the only fixed covariate. The expected probabilities were first converted into odds and then log-transformed before entering the model. For LOS, Poisson regression was performed using a log link function with the log-transformed expected LOS as the only fixed covariate. For coded patient experience responses, an ordered logistic regression was performed using a cumulative logit link function (no fixed effects were added).
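For intuition, the linear (identity-link) case of Equation 2 can be sketched with Henderson's mixed-model equations, treating the variance components as known. This is a simplified stand-in for the PROC GLIMMIX fits used in the study, not the authors' implementation, and all names and data are illustrative:

```python
import numpy as np

def predict_provider_effects(y, X, W, sigma2_e, sigma2_g):
    """Best linear unbiased prediction of provider random effects in a
    linear multi-membership model y = X b + W g + e, where W is the
    n-by-J matrix of attribution weights (each row sums to 1),
    g ~ N(0, sigma2_g I), and e ~ N(0, sigma2_e I)."""
    lam = sigma2_e / sigma2_g                  # shrinkage ratio
    p, J = X.shape[1], W.shape[1]
    # Henderson's mixed-model equations: one joint solve for b and g.
    A = np.block([[X.T @ X,            X.T @ W],
                  [W.T @ X, W.T @ W + lam * np.eye(J)]])
    rhs = np.concatenate([X.T @ y, W.T @ y])
    return np.linalg.solve(A, rhs)[p:]         # predicted gamma_j per provider
```

The `lam * np.eye(J)` term is what produces shrinkage: providers with little attributed weight are pulled toward the population average of 0.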
MM Model-based Metrics. Each fitted MM model produces a predicted random effect for each provider. The provider-specific random effects can be interpreted as the unobserved influence of each provider on the outcome after controlling for any fixed effect included in the model. Therefore, the provider-specific random effects were used to evaluate the relative provider performance, which is analogous to the individual provider-level metrics used in the PAPR method.
Measuring provider performance using a MM model is more flexible and robust to outliers compared to the standard approach using OE indices or simple averages. First, although not investigated here, the effect of patient-, visit-, provider-, and/or temporal-level covariates can be controlled when evaluating provider performance. For example, a patient’s socioeconomic status, a provider’s workload, and seasonal factors can be added to the MM model. These external factors are not accounted for in OE indices.
Another advantage of using predicted random effects is the concept of "shrinkage." The estimation of random effects inherently accounts for small sample sizes (when providers have not treated many patients) and/or for a large ratio of patient variance to provider variance (when patient outcome variability is much higher than provider performance variability). In both cases, the estimated random effect is pulled closer to 0, signaling that the provider's performance is closer to the population average. See Henderson15 and Mood16 for further details.
In contrast, OE indices can result in unreliable estimates when a provider has not cared for many patients. This is especially prevalent when the outcome is binary with a low probability of occurring, such as mortality. Indeed, provider-level mortality OE indices are routinely 0 when the patient counts are low, which skews performance rankings unfairly. Finally, OE indices also ignore the magnitude of the variance of an outcome between providers and patients, which can be large.
Comparison Methodology
In this study, we seek to compare the 2 methods of attribution, PAPR and PAMM, to determine whether there are meaningful differences between them when measuring provider performance. Using retrospective data described in the next section, each attribution method was used independently to derive provider-level metrics. To assess relative performance, percentiles were assigned to each provider based on their metric values so that, in the end, there were 2 percentile ranks for each provider for each metric.
Using these paired percentiles, we derived the following measures of concordance, similar to Herzke et al3: (1) the percent concordance, defined as the number of providers who landed in the top half (greater than the median) or the bottom half under both attribution models, divided by the total number of providers; (2) the median of the absolute difference in percentiles under both attribution models; and (3) the Pearson correlation coefficient of the paired provider ranks. The first measure is a global measure of concordance between the 2 approaches and would be expected to be 50% by chance. The second measure gauges how an individual provider's rank is affected by the change in attribution methodology. The third measure is a statistical measure of linear correlation of the paired percentiles and was not included in the Herzke et al3 study.
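These 3 agreement measures are straightforward to compute from the paired percentiles; a small sketch with invented percentile ranks for 4 providers:

```python
import numpy as np

def concordance_measures(pct_a, pct_b):
    """Agreement between 2 attribution methods, given each provider's
    percentile rank under each: (1) share of providers landing in the
    same half under both, (2) median absolute percentile difference,
    (3) Pearson correlation of the paired percentiles."""
    a, b = np.asarray(pct_a, float), np.asarray(pct_b, float)
    same_half = float(((a > np.median(a)) == (b > np.median(b))).mean())
    med_abs_diff = float(np.median(np.abs(a - b)))
    pearson_r = float(np.corrcoef(a, b)[0, 1])
    return same_half, med_abs_diff, pearson_r

# Hypothetical PAPR vs PAMM percentiles for 4 providers:
result = concordance_measures([10, 20, 80, 90], [15, 25, 85, 95])
# Full concordance (1.0), small shifts (median diff 5.0), near-perfect correlation.
```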
All statistical analyses were performed on SAS (version 9.4; Cary, NC) and the MM models were fitted using PROC GLIMMIX with the EFFECT statement. The Banner Health Institutional Review Board approved this study.
Results
Descriptive Statistics
A total of
Multi-Membership Model Results
Table 3 displays the results after independently fitting MM models to each of the 3 clinical outcomes. Along with a marginal intercept, the only covariate in each model was the corresponding expected value after a transformation. This was added to use the same information that is typically used in OE indices, therefore allowing for a proper comparison between the 2 attribution methods. The provider-level variance represents the between-provider variation and measures the amount of influence providers have on the corresponding outcome after controlling for any covariates in the model. A provider-level variance of 0 would indicate that providers do not have any influence on the outcome. While the mortality and readmission model results can be compared to each other, the LOS model cannot, given its different scale and transformation.
The results in Table 3 suggest that each expected value covariate is highly correlated with its corresponding outcome, which is the anticipated conclusion given that they are constructed in this fashion. The estimated provider-level variances indicate that, after including an expected value in the model, providers have less of an influence on a patient’s LOS and likelihood of being readmitted. On the other hand, the results suggest that providers have much more influence on the likelihood of a patient dying in the hospital, even after controlling for an expected mortality covariate.
Table 4 shows the results after independently fitting MM-ordered logistic models to each of the 3 survey questions. The similar provider-level variances suggest that providers have the same influence on the patient’s perception of the quality of their interactions with the doctor (Doctor), the quality of the care they received (Care), and their likelihood to recommend a friend or family member to the hospital (NPS).
Comparison Results Between Both Attribution Methods
Table 5 compares the 2 attribution methods when ranking providers based on their performance on each outcome measure. The comparison metrics gauge how well the 2 methods agree overall (percent concordance), agree at the provider level (absolute percentile difference and interquartile range [IQR]), and how the paired percentiles linearly correlate to each other (Pearson correlation coefficient).
LOS, by a small margin, had the lowest concordance of clinical outcomes (72.1%), followed by mortality (75.9%) and readmissions (82.1%). Generally, the survey scores had higher percent concordance than the clinical outcome measures, with Doctor at 84.1%, Care at 85.9%, and NPS having the highest percent concordance at 86.6%. Given that by chance the percent concordance is expected to be 50%, there was notable discordance, especially with the clinical outcome measures. Using LOS performance as an example, one attribution methodology would rank a provider in the top half or bottom half, while the other attribution methodology would rank the same provider exactly the opposite way about 28% of the time.
The median absolute percentile difference between the 2 methods was more modest (between 7 and 15). Still, there were some providers whose performance ranking was heavily impacted by the attribution methodology that was used. This was especially true when evaluating performance for certain clinical measures, where the choice of attribution method could shift a provider's ranking by up to 90 percentile points.
The paired percentiles were positively correlated when ranking performance using any of the 6 measures. This suggests that both methodologies assess performance generally in the same direction, irrespective of the methodology and measure. We did not investigate more complex correlation measures and left this for future research.
It should be noted that ties occurred much more frequently with the PAPR method than when using PAMM and therefore required tie-breaking rules to be designed. Given the nature of OE indices, PAPR methodology is especially sensitive to ties whenever the measure includes counting the number of events (for example, mortality and readmissions) and whenever there are many providers with very few attributed patients. On the other hand, using the PAMM method is much more robust against ties given that the summation of all the weighted attributed outcomes will rarely result in ties, even with a nominal set of providers.
Discussion
In this study, the PAMM methodology was introduced and was used to assess relative provider performance on 3 clinical outcome measures and 3 patient survey scores. The new approach aims to distribute each outcome among all providers who provided care for a patient in an inpatient setting. Clinical notes were used to account for patient-to-provider interactions, and fitted MM statistical models were used to compute the effects that each provider had on each outcome. The provider effect was introduced as a random effect, and the set of predicted random effects was used to rank the performance of each provider.
The PAMM approach was compared to the more traditional methodology, PAPR, where each patient is attributed to only 1 provider: the discharging physician in this study. Using this approach, OE indices of clinical outcomes and averages of survey scores were used to rank the performance of each provider. This approach resulted in many ties, which were broken based on the number of hospitalizations, although other tie-breaking methods may be used in practice.
Both methodologies showed modest concordance with each other for the clinical outcomes, but higher concordance for the patient survey scores. This was also true when using the Pearson correlation coefficient to assess agreement. The 1 outcome measure that showed the least concordance and least linear correlation between methods was LOS, which would suggest that LOS performance is more sensitive to the attribution methodology that is used. However, it was the least concordant by a small margin.
Furthermore, although the medians of the absolute percentile differences were small, there were some providers who had large deviations, suggesting that some providers would move from being shown as high-performers to low-performers and vice versa based on the chosen attribution method. We investigated examples of this and determined that the root cause was the difference in effective sample sizes for a provider. For the PAPR method, the effective sample size is simply the number of hospitalizations attributed to the provider. For the PAMM method, the effective sample size is the sum of all non-zero weights across all hospitalizations where the provider cared for a patient. By and large, the PAMM methodology provides more information about the provider effect on an outcome than the PAPR approach because every provider-patient interaction is considered. For example, providers who do not routinely discharge patients, but often care for patients, will have rankings that differ dramatically between the 2 methods.
The PAMM methodology has many statistical advantages that were not fully utilized in this comparative study. For example, we did not include any covariates in the MM models except for the expected value of the outcome, when it was available. Still, it is known that other covariates can impact an outcome as well, such as the patient’s age, socioeconomic indicators, existing chronic conditions, and severity of hospitalization, which can be added to the MM models as fixed effects. In this way, the PAMM approach can control for these other covariates, which are typically outside of the control of providers but typically ignored using OE indices. Therefore, using the PAMM approach would provide a fairer comparison of provider performance.
Using the PAMM method, most providers had a large sample size to assess their performance once all the weighted interactions were included. Still, there were a few who did not care for many patients for a variety of reasons. In these scenarios, MM models “borrow” strength from other providers to produce a more robust predicted provider effect by using a weighted average between the overall population trend and the specific provider outcomes (see Rao and Molina17). As a result, PAMM is a more suitable approach when the sample sizes of patients attributed to providers can be small.
One of the most interesting findings of this study was the relative size of the provider-level variance to the size of the fixed effect in each model (Table 3). Except for mortality, these variances suggest that there is a small difference in performance from one provider to another. However, these should be interpreted as the variance when only 1 provider is involved in the care of a patient. When multiple providers are involved, using basic statistical theory, the overall provider-level variance will be σγ² ∑jwij² (see Equation 2). For example, the estimated variance among providers for LOS was 0.03 (on a log scale), but, using the scenario in the Figure, the overall provider-level variance for this hospitalization will be 0.03 × (0.375² + 0.125² + 0.5²) ≈ 0.012. Hence, the combined effect of providers on LOS is less than would be expected. Indeed, as more providers are involved with a patient's care, the more their combined influence on an outcome is diluted.
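The dilution arithmetic is easy to verify directly; a quick check using the attribution weights from the Figure's scenario:

```python
# Overall provider-level variance for one hospitalization:
# sigma_gamma^2 * sum_j(w_ij^2), per Equation 2.
sigma_g2 = 0.03                      # estimated between-provider variance (LOS)
weights = [0.375, 0.125, 0.5]        # attribution weights from the Figure
overall = sigma_g2 * sum(w ** 2 for w in weights)
# 0.03 * 0.40625 = 0.0121875, i.e., ~0.012 as stated, well below 0.03.
```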
In this study, the PAMM approach placed an equal weight on all provider-patient interactions via clinical note authorship, but that may not be optimal in some settings. For example, it may make more sense to set a higher weight on the provider who admitted or discharged the patient while placing less (or 0) weight on all other interactions. In the extreme, if the full weight were placed on 1 provider interaction (eg, during discharge), then the MM model would reduce to a one-way random-effects model. The flexibility of weighting interactions is a feature of the PAMM approach, but any weighting framework must be transparent to the providers before implementation.
Conclusion
This study demonstrates that the PAMM approach is a feasible option within a large health care organization. For P4P programs to be successful, providers must be able to trust that their performance will be fairly assessed and that all provider-patient interactions are captured to provide a full comparison among their peers. The PAMM methodology is one solution for spreading the positive (and negative) outcomes across all providers who cared for a patient and, if implemented, would add trust and fairness to the measurement and assessment of provider performance.
Acknowledgments: The authors thank Barrie Bradley for his support in the initial stages of this research and Dr. Syed Ismail Jafri for his help and support on the standard approaches of assessing and measuring provider performances.
Corresponding author: Rachel Ginn, MS, Banner Health Corporation, 2901 N. Central Ave., Phoenix, AZ 85012; [email protected].
Financial disclosures: None.
1. Abduljawad A, Al-Assaf AF. Incentives for better performance in health care. Sultan Qaboos Univ Med J. 2011;11:201-206.
2. Milstein R, Schreyoegg J. Pay for performance in the inpatient sector: a review of 34 P4P programs in 14 OECD countries. Health Policy. 2016;120:1125-1140.
3. Herzke CA, Michtalik HJ, Durkin N, et al. A method for attributing patient-level metrics to rotating providers in an inpatient setting. J Hosp Med. 2018;13:470-475.
4. Batbaatar E, Dorjdagva J, Luvsannyam A, Savino MM, Amenta P. Determinants of patient satisfaction: a systematic review. Perspect Public Health. 2017;137:89-101.
5. Ballou D, Sanders W, Wright P. Controlling for student background in value-added assessment of teachers. J Educ Behav Stat. 2004;29:37-65.
6. Hill PW, Goldstein H. Multilevel modeling of educational data with cross-classification and missing identification for units. J Educ Behav Stat. 1998;23:117-128.
7. Rasbash J, Browne WJ. Handbook of Multilevel Analysis. Springer; 2007.
8. Browne WJ, Goldstein H, Rasbash J. Multiple membership multiple classification (MMMC) models. Statistical Modelling. 2001;1:103-124.
9. Sanders WL, Horn SP. The Tennessee Value-Added Assessment System (TVAAS)—mixed-model methodology in educational assessment. J Pers Eval Educ. 1994;8:299-311.
10. Kroch EA, Duan M. CareScience Risk Assessment Model: Hospital Performance Measurement. Premier, Inc., 2008. http://www.ahrq.gov/qual/mortality/KrochRisk.htm
11. Schumacher DJ, Wu DTY, Meganathan K, et al. A feasibility study to attribute patients to primary interns on inpatient ward teams using electronic health record data. Acad Med. 2019;94:1376-1383.
12. Simoes J, Krumholz HM, Lin Z. Hospital-level 30-day risk-standardized readmission measure. Centers for Medicare & Medicaid Services, 2018. https://www.cms.gov/Medicare/Quality-Initiatives-Patient-Assessment-Instruments/HospitalQualityInits/Downloads/Hospital-Wide-All-Cause-Readmission-Updates.zip
13. Krol MW, de Boer D, Delnoij DM, Rademakers JJDJM. The Net Promoter Score: an asset to patient experience surveys? Health Expect. 2015;18:3099-3109.
14. Doyle C, Lennox L, Bell D. A systematic review of evidence on the links between patient experience and clinical safety and effectiveness. BMJ Open. 2013;3:e001570.
15. Henderson CR. Sire evaluation and genetic trends. J Anim Sci. 1973;1973:10-41.
16. Mood AM. Introduction to the Theory of Statistics. McGraw-Hill; 1950.
17. Rao JNK, Molina I. Small Area Estimation. Wiley; 2015.
Noninvasive Ventilation Use Among Medicare Beneficiaries at the End of Life
Study Overview
Objective. To examine the trend of noninvasive and invasive mechanical ventilation at the end of life from 2000 to 2017.
Design. Observational population-based cohort study.
Setting and participants. The study was a population-based cohort study to examine the use of noninvasive and invasive mechanical ventilation among decedents. The study included a random 20% sample of Medicare beneficiaries older than 65 years who were hospitalized in the last 30 days of life and died between January 1, 2000, and December 31, 2017, except for the period October 1, 2015, to December 31, 2015, when the transition from International Classification of Diseases, Ninth Revision (ICD-9) to ICD-10 occurred. Beneficiaries with the primary admitting diagnosis of cardiac arrest or with preexisting tracheostomy were excluded because of expected requirements for ventilatory support. The sample included a total of 2,470,735 Medicare beneficiaries; mean age was 82.2 years, and 54.8% were female. Primary admitting diagnosis codes were used to identify 3 subcohorts: congestive heart failure, chronic obstructive pulmonary disease, and cancer; a fourth subcohort of dementia was identified using the primary admitting diagnosis code or the first 9 secondary diagnosis codes.
Main outcome measures. The study used procedure codes to identify the use of noninvasive ventilation, invasive mechanical ventilation, or none among decedents who were hospitalized in the last 30 days of life. Descriptive statistics to characterize variables by year of hospitalization and ventilatory support were calculated, and the rates of noninvasive and invasive mechanical ventilation use were tabulated. Other outcomes of interest included site of death (in-hospital death), hospice enrollment at death, and hospice enrollment in the last 3 days of life as measures of end-of-life care use. Multivariable logistic regressions were used to examine noninvasive and invasive mechanical ventilation use among decedents, and time trends were examined, with the pattern of use in year 2000 as reference. Subgroup analyses with the subcohorts of patients with different diagnoses were conducted to examine trends.
Main results. From 2000 to 2017, 16.3% of decedents had invasive mechanical ventilation, 3.7% had noninvasive ventilation, and 1.0% had both noninvasive and invasive ventilation during their hospital stay. Compared to the reference year 2000, there was a 9-fold increase in noninvasive ventilation use, from 0.8% to 7.1% in 2017, and invasive mechanical ventilation use also increased slightly, from 15.0% to 18.5%. Compared to year 2000, decedents were 2.63 times and 1.04 times (adjusted odds ratio [OR]) more likely to receive noninvasive ventilation and invasive mechanical ventilation, respectively, in 2005, 7.87 times and 1.39 times more likely in 2011, and 11.84 times and 1.63 times more likely in 2017.
Subgroup analysis showed that for congestive heart failure and chronic obstructive pulmonary disease, the increase in noninvasive ventilation use mirrored the trend observed for the overall population, but the use of invasive mechanical ventilation did not increase from 2000 to 2017, with a rate of use of 11.1% vs 7.8% (adjusted OR, 1.07; 95% confidence interval [CI], 0.95-1.19) for congestive heart failure and 17.4% vs 13.2% (OR, 1.03; 95% CI, 0.88-1.21) for chronic obstructive pulmonary disease. For the cancer and dementia subgroups, the increase in noninvasive ventilation use from 2000 to 2017 was accompanied by an increase in the use of invasive mechanical ventilation, with a rate of 6.2% vs 7.4% (OR, 1.40; 95% CI, 1.26-1.55) for decedents with cancer and a rate of 5.7% vs 6.2% (OR, 1.28; 95% CI, 1.17-1.41) for decedents with dementia. For other measures of end-of-life care, noninvasive ventilation use, compared with invasive mechanical ventilation use, was associated with a lower rate of in-hospital (acute care) death (50.3% vs 76.7%), a lower rate of hospice enrollment in the last 3 days of life (late hospice enrollment; 57.7% vs 63.0%), and a higher rate of hospice enrollment at death (41.3% vs 20.0%).
Conclusion. There was an increase in the use of noninvasive ventilation from 2000 through 2017 among Medicare beneficiaries who died. The findings also suggest that the use of invasive mechanical ventilation did not increase among decedents with congestive heart failure and chronic obstructive pulmonary disease but increased among decedents with cancer and dementia.
Commentary
Noninvasive ventilation offers an alternative to invasive mechanical ventilation for providing ventilatory support for respiratory failure, and may offer benefits as it could avert adverse effects associated with invasive mechanical ventilation, particularly in the management of respiratory failure due to congestive heart failure and chronic obstructive pulmonary disease.1 There is evidence for potential benefits of use of noninvasive ventilation in other clinical scenarios, such as pneumonia in older adults with comorbidities, though its clinical utility is not as well established for other diseases.2
As noninvasive ventilation is introduced into clinical practice, it is not surprising that its use increased substantially over the period of the study (2000 to 2017). Advance directives that involve discussion of life-sustaining treatments, including in scenarios with respiratory failure, may also result in physician orders that specify whether an individual desires invasive mechanical ventilation versus other medical treatments, including noninvasive ventilation.3,4 By examining the temporal trends of use of noninvasive and invasive ventilation, this study reveals that invasive mechanical ventilation use among decedents with dementia and cancer has increased, despite increases in the use of noninvasive ventilation. It is important to understand further what would explain these temporal trends and whether the use of noninvasive and also invasive mechanical ventilation at the end of life represents appropriate care with clear goals or whether it may represent overuse. It is also less clear in the end-of-life care scenario what the goals of treatment with noninvasive ventilation would be, especially if it does not avert the use of invasive mechanical ventilation.
The study includes decedents only, thus limiting the ability to draw conclusions about clinically appropriate care.5 Further studies should examine a cohort of patients who have serious and life-threatening illness to examine the trends and potential effects of noninvasive ventilation on outcomes and utilization, as individuals who have improved and survived would not be included in this present decedent cohort.
Applications for Clinical Practice
This study highlights changes in the use of noninvasive and invasive ventilation over time and the different trends seen among subgroups with different diagnoses. For older adults with serious comorbid illness such as dementia, it is especially important to have discussions on advance directives so that care at the end of life is concordant with the patient’s wishes and that unnecessary, burdensome care can be averted. Further studies to understand and define the appropriate use of noninvasive and invasive mechanical ventilation for older adults with significant comorbidities who have serious, life-threatening illness are needed to ensure appropriate clinical treatment at the end of life.
–William W. Hung, MD, MPH
1. Lindenauer PK, Stefan MS, Shieh M, et al. Outcomes associated with invasive and noninvasive ventilation among patients hospitalized with exacerbations of chronic obstructive pulmonary disease. JAMA Intern Med. 2014;174:1982-1993.
2. Johnson CS, Frei CR, Metersky ML, et al. Non-invasive mechanical ventilation and mortality in elderly immunocompromised patients hospitalized with pneumonia: a retrospective cohort study. BMC Pulm Med. 2014;14:7. doi:10.1186/1471-2466-14-7
3. Lee R, Brumback L, Sathitratanacheewin S, et al. Association of physician orders for life-sustaining treatment with ICU admission among patients hospitalized near the end of life. JAMA. 2020;323:950-960.
4. Bomba P, Kemp M, Black J. POLST: an improvement over traditional advance directives. Cleve Clin J Med. 2012;79:457-464.
5. Duncan I, Ahmed T, Dove H, Maxwell TL. Medicare cost at end of life. Am J Hosp Palliat Care. 2019;36:705-710.
Physician-Driven Discretionary Utilization: Measuring Overuse and Choosing Wisely
Overutilization and low-value care are important clinical and policy problems. Their measurement is challenging because it requires detailed clinical information. Additionally, there are inherent difficulties in identifying discretionary services likely to be inappropriate or low-value and in demonstrating that certain services produce little or no health benefit. Quantifying “ideal” expected testing rates (ones that would reflect minimization of inappropriate or low-value care without excluding essential, high-yield diagnostic services) presents additional challenges. Consequently, of 521 unique measures specified by national measurement programs and professional guidelines, 91.6% targeted underuse, while only 6.5% targeted overuse.1
The potential for unintended consequences of implementing measures to eliminate overuse is a barrier to incorporating such measures into practice.2 For example, measuring, reporting, and penalizing overuse of inappropriate bone scanning may lead to underuse in patients for whom scanning is crucial.2 Most overuse measures based on inappropriate or low-value indications relate to imaging and medications.1 However, there is increasing interest in overutilization measures based on a broad set of health services. Identifying low-value testing or treatments often requires a substantial degree of clinical detail to avoid the damaging inclusion of beneficial services, which may lead to unintended negative outcomes, creating skepticism among clinicians. Ultimately, getting measurement of low-value care wrong would undermine adoption of interventions to reduce overuse.
To reduce low-value care through expansive measures of provider ordering behavior,3 Ellenbogen et al4 derived a novel index to identify hospitals with high rates of low-yield diagnostic testing. This index is based on the concept that, in the presence of nonspecific, symptom-based principal diagnoses, a substantial proportion of (apparently) nondiagnostic studies were probably ordered despite a low pretest probability of serious disease. Since such symptom-based diagnoses reflect the absence of a more specific diagnosis, the examinations observed are markers of physician-driven decisions leading to discretionary utilization likely to be of low value to patients. This study fills a critical gap by providing a dual measure of appropriateness and yield, rather than utilization alone, advancing the Choosing Wisely campaign.3
Advantages of this overuse index include its derivation from administrative data, obviating the need for electronic health records, and incorporation of diagnostic yield at the inpatient-encounter level. One study selected procedures identifiable solely with claims from a set deemed overused by professional/consumer groups.5 However, the yield of physician decisions in specific cases was not measured. In contrast, this novel index is derived from an assessment of diagnostic yield.4 Although test results are not known with certainty, the absence of a specific discharge diagnosis serves as a test result proxy. Measurement of diagnostic examination yield at the patient-level (aggregated to the hospital-level) may be applicable across hospitals with varied patient populations, which include large differences in patient and/or family preferences to seek medical attention and engage in shared decision-making. The role that patient preferences play in decisions creates a limitation in this index—while decisions for the candidate diagnostic tests are physician driven, patient demand may be a confounding factor. This index cannot therefore be considered purely a measure of physician-induced intensity of diagnostic services. Patient-reported data would enhance future analyses by more fully capturing all dimensions of care necessary to identify low-value services. Subjective outcomes are critical in completely measuring the aggregate benefits of tests and interventions judged low-value based on objective metrics. Such data would also aid in quantifying the relative contributions of patient and physician preferences in driving discretionary utilization.
Finally, the derived index is restricted to diagnostic decision-making and may not be applicable to treatment-related practice patterns. However, the literature suggests strong correlations between diagnostic and therapeutic intensity. Application of this novel index will play an important role in reducing low-value discretionary utilization.
1. Newton EH, Zazzera EA, Van Moorsel G, Sirovich BE. Undermeasuring overuse--an examination of national clinical performance measures. JAMA Intern Med. 2015;175(10):1709-1711. https://doi.org/10.1001/jamainternmed.2015.4025
2. Mathias JS, Baker DW. Developing quality measures to address overuse. JAMA. 2013;309(18):1897-1898. https://doi.org/10.1001/jama.2013.3588
3. Bhatia RS, Levinson W, Shortt S, et al. Measuring the effect of Choosing Wisely: an integrated framework to assess campaign impact on low-value care. BMJ Qual Saf. 2015;24(8):523-531. https://doi.org/10.1136/bmjqs-2015-004070
4. Ellenbogen MI, Prichett L, Johnson PT, Brotman DJ. Development of a simple index to measure overuse of diagnostic testing at the hospital level using administrative data. J Hosp Med. 2021;16:xxx-xxx. https://doi.org/10.12788/jhm.3547
5. Segal JB, Bridges JF, Chang HY, et al. Identifying possible indicators of systematic overuse of health care procedures with claims data. Med Care. 2014;52(2):157-163. https://doi.org/10.1097/MLR.0000000000000052
© 2021 Society of Hospital Medicine
Healthcare System Stress Due to Covid-19: Evading an Evolving Crisis
During the early phase of the novel coronavirus disease 2019 (COVID-19) epidemic in the United States, public health strategies focused on “flattening the curve” to ensure that healthcare systems in hard-hit regions had the ability to care for surges of acutely ill patients. Now, COVID-19 cases and hospitalizations are rising sharply throughout the country, and many healthcare systems are facing intense strain due to an influx of patients.
In this issue of JHM, Horwitz et al provide important insights on evolving inpatient care and healthcare system strain for patients with COVID-19. The authors evaluated 5,121 adults hospitalized with SARS-CoV-2 infection at a 3-hospital health system in New York City from March through August 2020,1 and found that patients hospitalized later during the time period were much younger and had fewer comorbidities. Importantly, the authors observed a marked decline in adjusted in-hospital mortality or hospice rates, from 25.6% in March to 7.6% in August.
What might explain the dramatic improvement in risk-adjusted mortality? The authors’ use of granular data from the electronic health record allowed them to account for temporal changes in demographics and clinical severity of hospitalized patients, indicating that other factors have contributed to the decline in adjusted mortality. One likely explanation is that increasing clinical experience in the management of patients with COVID-19 has resulted in the delivery of better inpatient care, while the use of evidence-based therapies for COVID-19 has also grown. Although important gains have been made in treatment, the care of patients with COVID-19 largely remains supportive. But supportive care requires an adequate number of hospital beds, healthcare staff, and sufficient critical care resources, at minimum.
Healthcare system strain has undoubtedly played a critical role in the outcomes of hospitalized patients. Horwitz et al found that the number of COVID-19 hospitalizations in March and April, when death rates were highest, was more than 10 times greater than in July and August, when death rates were lowest. As noted in the early epidemic in China, COVID-19 death rates partially reflect access to high-quality medical care.2 And, in the US, hospitals’ capacity to care for critically ill patients with COVID-19 is an important predictor of death.3
As COVID-19 cases now surge across the country, ensuring that healthcare systems have the resources needed to care for patients will be paramount. Unfortunately, the spread of COVID-19 is exponential, while hospitals’ ability to scale up surge capacity over a short timeframe is not. Already, reports are emerging across the country of hospitals reaching bed capacity and experiencing shortages of physicians and nurses.
To curtail escalating healthcare system stress in the coming months, we must minimize the cluster-based super-spreading that drives epidemic surges. Approximately 15% to 20% of infected cases account for up to 80% of disease transmission.4 Therefore, strategies must address high-risk scenarios that involve crowding, close prolonged contact, and poor ventilation, such as weddings, sporting events, religious gatherings, and indoor dining and bars.
Without adequate testing or tracing capacity during viral surges, employing nonpharmaceutical interventions to mitigate spread is key. Japan, which created the “3 Cs” campaign (avoid close contact, closed spaces, and crowds), utilized a response framework that specifically targeted super-spreading. The US should follow a similar strategy in the coming months to protect healthcare systems, healthcare workers, and most importantly, our patients.
1. Horwitz LI, Jones SA, Cerfolio RJ, et al. Trends in COVID-19 risk-adjusted mortality rates. J Hosp Med. 2021;16:XXX-XXX. https://doi.org/10.12788/jhm.3552
2. Ji Y, Ma Z, Peppelenbosch MP, Pan Q. Potential association between COVID-19 mortality and health-care resource availability. Lancet Glob Health. 2020;8(4):e480. https://doi.org/10.1016/S2214-109X(20)30068-1
3. Gupta S, Hayek SS, Wang W, et al; STOP-COVID Investigators. Factors associated with death in critically ill patients with coronavirus disease 2019 in the US. JAMA Intern Med. 2020;180(11):1-12. https://doi.org/10.1001/jamainternmed.2020.3596
4. Sun K, Wang W, Gao L, et al. Transmission heterogeneities, kinetics, and controllability of SARS-CoV-2. Science. 2020;24:eabe2424. https://doi.org/10.1126/science.abe2424
Sexual Harassment and Gender Discrimination in Hospital Medicine: A Call to Action
Hospitalists are known as change agents for their fierce patient advocacy and expertise in hospital systems redesign. The field of hospital medicine has claimed numerous successes and the hospitalist model has been embraced by institutions across the country. Yet the lived experiences of hospitalists surveyed by Bhandari et al in this month’s issue of JHM suggest a grim undertone.1 Hospital medicine is a field with high physician burnout rates, stark gender inequities in pay, leadership, and academic opportunities, and an unacceptably high prevalence of sexual harassment and gender discrimination. Women hospitalists disproportionately bear the brunt of these inequities. All hospitalists, however, can and should be an integral part of the path forward by recognizing the impact of these inequities on colleagues and hospital systems.
The study by Bhandari et al adds to the increasing body of knowledge documenting high levels of sexual harassment and gender discrimination in medicine and highlights important gender differences in these experiences among hospitalists nationally.1,2 Among 336 respondents across 18 academic institutions, sexual harassment and gender discrimination were both common and highly problematic within the field of hospital medicine, confirming what prior narratives have only anecdotally shared. Both men and women experienced harassment, from patients and colleagues alike, but women endured higher levels compared with men on all the measures studied.1
Qualitative comments in this study are noteworthy, including one about a hospitalist’s institution allowing potential faculty to be interviewed about plans for pregnancy, childcare, and personal household division of labor. One might argue that this knowledge is necessary for shift-based inpatient work in the context of a worldwide pandemic in which pregnant workers are likely at higher risk of morbidity and mortality. It remains illegal, however, to ask such questions, which are representative of the types of characteristics that constitute a toxic workplace environment. Moreover, such practices are particularly problematic given that pregnancy and childbearing for women in medicine come with their own set of well-documented unique challenges.3
The considerable body of research in this field should help guide new research priorities and targets for intervention. Does the experience of sexual harassment impact hospitalists’ intentions to leave their institutions or the career as a whole? Does sexual harassment originating from colleagues or from patients and families affect patient safety or quality of care? Do interventions in other international hospital settings specifically targeting respectfulness translate to American hospitals?4 These questions and a host of others merit our attention.
Hospital system leaders should work with hospital medicine leaders to support wholesale institutional cultural transformation. Implementation of antiharassment measures recommended in the 2018 report on sexual harassment from the National Academies of Sciences, Engineering, and Medicine is critical.2 This means supporting diverse, inclusive, and respectful environments at all levels within the organization, improving transparency and accountability for how incidents are handled, striving for strong and diverse leadership, providing meaningful support for targets of harassment, measuring prevalence over time, and encouraging professional societies to adopt similar actions. Furthermore, we believe it is critical to adopt a zero-tolerance policy for harassing behaviors and to hold individuals accountable. Encouraging all individuals within healthcare systems to uphold their ethical obligations to combat harassment and bias on a personal level is important.5 If left unaddressed, the unmet needs of those who are subjected to harassment and bias will continue to be problematic for generations to come, with detrimental effects throughout healthcare systems and the broader populations they serve.
1. Bhandari S, Jha P, Cooper C, Slawski B. Gender-based discrimination and sexual harassment among academic internal medicine hospitalists. J Hosp Med. 2021;16:XXX-XXX. https://doi.org/10.12788/jhm.3561
2. National Academies of Sciences, Engineering, and Medicine. Sexual harassment of women: climate, culture, and consequences in academic sciences, engineering, and medicine. National Academies Press; 2018. https://doi.org/10.17226/24994
3. Stentz NC, Griffith KA, Perkins E, Jones RD, Jagsi R. Fertility and childbearing among American female physicians. J Womens Health (Larchmt). 2016;25(10):1059-1065. https://doi.org/10.1089/jwh.2015.5638
4. Leiter MP, Laschinger HKS, Day A, Oore DG. The impact of civility interventions on employee social behavior, distress, and attitudes. J Appl Psychol. 2011;96(6):1258-1274. https://doi.org/10.1037/a0024442
5. Mello MM, Jagsi R. Standing up against gender bias and harassment - a matter of professional ethics. N Engl J Med. 2020;382(15):1385-1387. https://doi.org/10.1056/nejmp1915351
© 2021 Society of Hospital Medicine
Missed Opportunities for Transitioning to Oral Antibiotic Therapy
Historically, bacterial infections in hospitalized children were treated with intravenous (IV) antibiotics for the duration of therapy—frequently with placement of a vascular catheter. Risks associated with vascular catheters and the limitations they impose on a child’s quality of life are increasingly being recognized—including thrombi, catheter dislodgement, and secondary infections as catheters provide a portal of entry for bacteria into the bloodstream (ie, catheter-associated bloodstream infections) or along the catheter wall (ie, phlebitis). This potential for harm underscores the importance of transitioning to oral antibiotic therapy whenever possible.
In this issue of the Journal of Hospital Medicine, Cotter et al used an administrative database to investigate opportunities to transition from IV to oral antibiotics for patients across multiple pediatric hospitals.1 Their novel metric, “percent opportunity,” represents the percent of days that there was the opportunity to transition from IV to oral antibiotics. They found that over 50% of the time, IV antibiotics could have been switched to equivalent oral agents. Furthermore, there was wide variability across institutions in IV-to-oral transitioning practices; 45% of the variation was seemingly attributable to institution-level preferences.
The large sample size and multicenter nature of this study improve its external validity. However, using administrative data to make assumptions about clinical decision-making has limitations. The definition of opportunity days assumes that any day a child receives other enteral medications provides an “opportunity” to prescribe oral antibiotics instead. This does not account for other reasonable indications to continue IV therapy (eg, endocarditis) and may overestimate true opportunities for conversion to oral therapy. Alternatively, their conservative approach of excluding days when a child received both IV and oral antibiotics may underestimate opportunities for oral transition. Regardless of the precision of their estimates, their findings highlight that there is room to improve the culture of transitioning hospitalized children from IV to oral antibiotic therapy.
Admittedly, the evidence for clinically effective conversion to oral therapy in children remains incomplete. Data support oral antibiotics for hospitalized children with pneumonia, cellulitis, pyelonephritis, and osteoarticular infections—even with associated bacteremia.2 There is also evidence for successful conversion to oral therapy for complicated appendicitis, retropharyngeal abscesses, mastoiditis, and orbital cellulitis.2
The decision to transition to oral therapy does not need to be delayed until the time of hospital discharge because each additional day of IV therapy poses a cumulative risk. Rather, prescribers should apply a structured approach, such as the “Four Moments of Antibiotic Decision Making,” on a daily basis for every hospitalized child receiving antibiotics to prompt timely decisions about discontinuing IV therapy, narrowing IV therapy, or transitioning from IV to oral antibiotic therapy.3 We applaud Cotter et al for shedding light on an area in need of standardization of care, which could optimize patient outcomes and minimize harm for a large number of children.1 The “percent opportunity” to switch from IV to oral antibiotic therapy is a promising antibiotic stewardship metric, and its association with clinical outcomes merits further investigation.
1. Cotter JM, Hall M, Girdwood ST, et al. Opportunities for stewardship in the transition from intravenous to enteral antibiotics in hospitalized pediatric patients. J Hosp Med. 2021;16:XXX-XXX. https://doi.org/10.12788/jhm.3538
2. McMullan BJ, Andresen D, Blyth CC, et al. Antibiotic duration and timing of the switch from intravenous to oral route for bacterial infections in children: systematic review and guidelines. Lancet Infect Dis. 2016;16(8):e139-e152. https://doi.org/10.1016/S1473-3099(16)30024-X
3. Tamma PD, Miller MA, Cosgrove SE. Rethinking how antibiotics are prescribed: incorporating the 4 moments of antibiotic decision making into clinical practice. JAMA. 2019;321(2):139-140. https://doi.org/10.1001/jama.2018.19509
© 2021 Society of Hospital Medicine
Leadership & Professional Development: The Delicate Dance of Yes and No
“Success starts with saying yes. Saying no maintains it.”
—Anonymous
You have just received an opportunity that seems worthwhile. However, you already have a lot on your plate. What do you do? The balance of when to say “yes” and when to say “no” to opportunities, projects, and collaborations is often challenging, especially for busy clinicians. There is a trend, with good reason, toward encouraging individuals to say “no” more often. While there is much to be said for that approach, many good opportunities can be missed that way. As Amy Poehler put it, “Saying ‘yes’ doesn’t mean I don’t know how to say no.”
So how does one arrive at a good balance?
DEFINE GOALS AT EACH STAGE OF YOUR CAREER
Most importantly, figure out who you are, what you want your “brand” to be, and where you envision your career going. This is likely the most difficult step. Start with a roadmap and recalibrate as your career unfolds. Early in your career, seek breadth rather than depth.
As your career progresses, the “yes-no” balance may shift. We recommend you say “yes” frequently early on. Be open to opportunities that come up, even if they do not perfectly align with your goals. Explore opportunities beyond the limits of your job description. After all, opportunities beget more opportunities. Consider “stretch opportunities.” If you are offered an opportunity for which you may not have 100% of the skills—and which is, therefore, a “stretch”—but which aligns with your career goals, do not turn it down. Consider saying “yes” and learning on the job. A mentor or coach can help you navigate these decisions.
CONSIDER THE MANY REASONS TO SAY “YES” OR “NO”
Sometimes, it is important to say “yes” as part of being a “good citizen” in your department. Examples include mentoring learners, serving on a safety committee, teaching student lectures, or coaching a colleague. Often it is possible to align service with career goals.
Another consideration is the benefit of networking: developing alliances and building bridges. In addition to the service or productivity that come with projects or collaborations, these can be powerful networking opportunities. Networking broadly, both within and beyond your field of practice and within and outside your institution, is an important way to create “bonding capital” and “bridging capital,” ie, relationships based on your commonalities and relationships built across differences, respectively.1
Remember, when you say “yes,” you must deliver: every time, on time, and with excellence. When saying “yes” to more opportunities starts to impair your ability to deliver on what you have already committed to, it is time to say “no.” This will help you maintain balance, avoid burnout, and stay focused.
CONSIDER IMPACT VS EFFORT
When juggling a busy schedule, consider effort vs impact. There are many low-effort opportunities that have relatively high impact. For instance, as a junior faculty member interested in medical education, participating in a grading committee is low effort but can help you understand the process, connect you with educational leaders, and open doors to future opportunities. An effective strategy may be to incorporate a combination of low-effort and high-effort activities at any one time, while considering the impact of each, to help maintain balance. The effort-vs-impact balance may shift as you grow in your career.
CONCLUSION
Know where you are going, explore the opportunities that may get you there, and recalibrate often. The path to success is typically a circuitous one, so enjoy the journey and give it your all every step of the way.
1. Clark D. Start networking with people outside your industry. Harvard Bus Rev. October 20, 2016. Accessed December 11, 2020. https://hbr.org/2016/10/start-networking-with-people-outside-your-industry
“Success starts with saying yes. Saying no maintains it.”
—Anonymous
You have just received an opportunity that seems worthwhile. However, you already have a lot on your plate. What do you do? The balance of when to say “yes” and when to say “no” to opportunities, projects, and collaborations is often challenging, especially for busy clinicians. There is a trend, with good basis, to encourage individuals to say “no” more often. While there is much to be said for that, many good opportunities can be missed that way. As Amy Poehler put it, “Saying ‘yes’ doesn’t mean I don’t know how to say no.”
So how does one arrive at a good balance?
DEFINE GOALS AT EACH STAGE OF YOUR CAREER
Most importantly, figure out who you are, what you want your “brand” to be and where you envision your career going. This is likely the most difficult step. Start with a roadmap and recalibrate as your career unfolds. Early in your career, seek breadth rather than depth.
“Success starts with saying yes. Saying no maintains it.”
—Anonymous
You have just received an opportunity that seems worthwhile. However, you already have a lot on your plate. What do you do? The balance of when to say “yes” and when to say “no” to opportunities, projects, and collaborations is often challenging, especially for busy clinicians. There is a well-founded trend toward encouraging individuals to say “no” more often. While there is much to be said for that approach, many good opportunities can be missed that way. As Amy Poehler put it, “Saying ‘yes’ doesn’t mean I don’t know how to say no.”
So how does one arrive at a good balance?
DEFINE GOALS AT EACH STAGE OF YOUR CAREER
Most importantly, figure out who you are, what you want your “brand” to be, and where you envision your career going. This is likely the most difficult step. Start with a roadmap and recalibrate as your career unfolds. Early in your career, seek breadth rather than depth.
As your career progresses, the “yes-no” balance may shift. We recommend saying “yes” frequently early on. Be open to opportunities that come up, even if they do not perfectly align with your goals. Explore opportunities beyond the limits of your job description. After all, opportunities beget more opportunities. Consider “stretch opportunities.” If you are offered an opportunity that you may not have 100% of the skills for—and is, therefore, a “stretch”—but which aligns with your career goals, do not turn it down. Consider saying “yes” and learning on the job. A mentor or coach can help you navigate these decisions.
CONSIDER THE MANY REASONS TO SAY “YES” OR “NO”
Sometimes, it is important to say “yes” as part of being a “good citizen” in your department. Examples include mentoring learners, serving on a safety committee, teaching student lectures, or coaching a colleague. Often it is possible to align service with career goals.
Another consideration is the benefit of networking: developing alliances and building bridges. In addition to the service or productivity that comes with projects or collaborations, these can be powerful networking opportunities. Networking broadly, both within and beyond your field of practice and within and outside your institution, is an important way to create “bonding capital” and “bridging capital,” ie, relationships based on your commonalities and relationships built across differences, respectively.1
Remember, when you say “yes,” you must deliver: every time, on time, and with excellence. When saying “yes” to more opportunities starts to impair your ability to deliver on what you have already committed to, it is time to say “no.” This will help you maintain balance, avoid burnout, and stay focused.
CONSIDER IMPACT VS EFFORT
When juggling a busy schedule, consider effort vs impact. There are many low-effort opportunities that have relatively high impact. For instance, as a junior faculty member interested in medical education, participating in a grading committee is low effort but can help you understand the process, connect you with educational leaders, and open doors to future opportunities. An effective strategy may be to incorporate a combination of low-effort and high-effort activities at any one time, while considering the impact of each, to help maintain balance. The effort-vs-impact balance may shift as you grow in your career.
CONCLUSION
Know where you are going, explore the opportunities that may get you there, and recalibrate often. The path to success is typically a circuitous one, so enjoy the journey and give it your all every step of the way.
1. Clark D. Start networking with people outside your industry. Harvard Bus Rev. October 20, 2016. Accessed December 11, 2020. https://hbr.org/2016/10/start-networking-with-people-outside-your-industry
© 2021 Society of Hospital Medicine