What Factors Influence Online Ratings for Academic Orthopaedic Surgeons?
Brandon E. Earp, MD; Nattaly E. Greene, BSN; Kyra A. Benavent, BS; Tamara D. Rozental, MD
The authors report no conflict of interest related to this work.
©2020 by The Orthopaedic Journal at Harvard Medical School
Many physician-rating websites exist for patients to evaluate their health care providers. While studies have examined online ratings of orthopaedic surgeons, few have been able to identify factors that may directly contribute to patient ratings and reviews. Our aim was to identify specific academic, demographic, and professional factors that may influence patient ratings of orthopaedic surgeons. Online searches of three popular physician-rating websites were performed for faculty orthopaedic surgeons from four academic, tertiary care centers, and patient ratings were obtained. Additionally, demographic and professional data were recorded. The ratings were compared using Student's t-tests and chi-square tests for physicians with low (<4) and high (≥4) ratings on each site. Among the 165 physicians, 28 were women (17%) and 137 were men (83%); 137 (83%) categorized their race as Caucasian, and there were 14,569 reviews in total. All orthopaedic subspecialties were represented, and all providers had at least one rating. Age, sex, race, years in practice, education, and subspecialty did not influence the overall physician rating or the number of ratings. Senior academic ranks were associated with higher ratings at two of the three sites. Physicians with consistently low ratings across all three sites had a lower academic rank and were less likely to have attended a top medical school. In our physician population, education and academic rank seem to have the greatest impact on ratings. Most importantly, reviews are not consistent across sites. Physicians should be aware of the factors that influence ratings and continue to explore measures to best evaluate patient satisfaction.
LEVEL OF EVIDENCE Level V Prognostic Study
KEYWORDS Patient reviews, physician rating website, orthopedic surgeon, social media, online ratings
Online physician ratings and social media are evolving and becoming increasingly common tools for patients selecting members of their health care team.1 A plethora of physician-rating websites (PRWs) has emerged, allowing patients to evaluate their physician and review ratings provided by prior patients. A survey of the U.S. population found that 65% of respondents were aware of PRWs, with 19% of respondents rating these sites as “very important” when choosing a physician; similar findings have been demonstrated in European countries, where almost 75% of respondents reported awareness of PRWs.2-4 Healthgrades, a leading PRW, boasts over 30 million visits per month.5
Several studies within the orthopaedic subspecialties have sought to understand how online ratings reflect patient satisfaction.6,7 A study among spine surgeons identified trustworthiness as the most significant predictor of positive ratings from patients.8 Similarly, a review of 250 hand surgeon ratings found that a higher number of ratings and a stronger online presence were positively associated with favorable reviews.9 Neither study was able to identify a relationship between specific physician factors, such as age or experience, and patient satisfaction. Furthermore, the link between online ratings and level of expertise was not explored. Many studies have found that patient reviews can be inconsistent across different websites.2-4,6,9
We reviewed ratings from the three most popular PRWs for 165 faculty affiliated with a large academic orthopaedic surgery program. Our null hypothesis was that there would be no identifiable factors associated with online physician ratings.
The list of physician faculty members affiliated with an orthopaedic residency program was obtained from the residency website. This included full-time staff at four large tertiary care centers, as well as community affiliates. Three popular online rating websites, Healthgrades.com, Vitals.com, and USnews.com, were subsequently searched for each individual physician.
These sites were selected because they are among the most visited for physician ratings and provide detailed information without requiring a paid subscription. All searches were conducted in October 2018.
Demographic information such as age, sex, race, years in practice, subspecialty, academic rank, and leadership positions was collected. The education profile for each faculty member was reviewed to determine whether the physician attended a top 10 medical school according to the US News and World Report (USNWR) rankings and whether they completed post-graduate training at a top 20 USNWR hospital.10,11
The overall rating for each physician was recorded from each website and physicians were separated into those with low ratings (<4 out of 5, on average) or high ratings (≥4 out of 5, on average). The ratings for each physician were then compared across rating sites.
Statistical methods
Data are shown as mean ± standard deviation (SD) for continuous variables and n (%) for categorical variables. We compared physicians with low ratings vs. high ratings using Student’s t-tests and chi-square tests. We then took the group of physicians rated on all 3 sites and compared those who were “low” (<4) on every rating vs. those who were high on at least 1 rating. Next, we computed all pairwise concordance among the 3 ratings on our high/low dichotomy with the kappa statistic. Values <0.20 indicate minimal agreement, 0.21-0.40 fair, 0.41-0.60 moderate, and >0.60 substantial.12 We then tested the agreement among all 3 ratings using the publicly available %magree macro in SAS, which computes a multi-rater kappa statistic. All analyses were performed with SAS version 9.4 (SAS Institute, Cary, NC).
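As an illustration of these computations (the study itself used SAS; the sketch below is not the authors' code), the following Python snippet dichotomizes each site's mean rating at 4.0 and computes both the pairwise Cohen's kappa and a Fleiss-style multi-rater kappa analogous to what the %magree macro reports. The site names match those in the study, but the ratings shown are hypothetical placeholders, not study data.

```python
# Illustrative sketch only: dichotomize mean ratings at 4.0 and measure
# agreement, mirroring (in Python, not SAS) the pairwise and multi-rater
# kappa analyses described above. All rating values are made up.
from itertools import combinations

def cohens_kappa(a, b):
    """Cohen's kappa for two binary (0/1) ratings of the same physicians."""
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n        # observed agreement
    pa, pb = sum(a) / n, sum(b) / n                    # marginal "high" rates
    p_e = pa * pb + (1 - pa) * (1 - pb)                # chance agreement
    return (p_o - p_e) / (1 - p_e)

def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = raters placing subject i in category j."""
    n, k = len(counts), sum(counts[0])                 # subjects, raters
    n_cat = len(counts[0])
    p_j = [sum(row[j] for row in counts) / (n * k) for j in range(n_cat)]
    p_i = [(sum(c * c for c in row) - k) / (k * (k - 1)) for row in counts]
    p_bar, p_e = sum(p_i) / n, sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical mean ratings for five physicians rated on all three sites.
ratings = {
    "Healthgrades": [4.5, 3.2, 4.8, 3.9, 4.1],
    "Vitals":       [4.4, 3.5, 4.9, 4.2, 3.8],
    "USnews":       [3.1, 2.9, 4.2, 3.0, 3.4],
}
# Dichotomize as in the study: 1 = "high" (>=4 of 5), 0 = "low" (<4).
high = {site: [int(r >= 4.0) for r in rs] for site, rs in ratings.items()}

for s1, s2 in combinations(high, 2):                   # pairwise agreement
    print(f"{s1} vs {s2}: kappa = {cohens_kappa(high[s1], high[s2]):.2f}")

# Multi-rater agreement across all three sites on the same dichotomy.
per_doc = zip(*high.values())                          # one tuple per physician
table = [[3 - sum(h), sum(h)] for h in per_doc]        # [n_low, n_high] rows
print(f"Multi-rater kappa = {fleiss_kappa(table):.2f}")
```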
The demographics of our physician population are presented in Table 1. There were 165 physicians, with 28 women (17%) and 137 men (83%). One hundred and thirty-seven (83%) categorized their race as Caucasian. All orthopaedic subspecialties were represented. Thirty-eight (23.1%) physicians graduated from a top 10 medical school and 80 (48.4%) completed post-graduate training at a top 20 hospital according to the USNWR rankings for the 2018 Best Hospitals Honor Roll.10,11
All providers had at least one rating on one of the PRWs. Vitals.com is specifically targeted to physician reviews, Healthgrades.com allows reviews of both hospitals and providers, and USnews.com is even broader, allowing reviews within and outside of healthcare.
There was a total of 14,569 reviews across all three sites. On Healthgrades.com, 156 (94.5%) of the physicians were rated, on USnews.com 142 (86%) were rated, and on Vitals.com 156 (94.5%) were rated. On Healthgrades.com, 55 physicians (35.3%) had an average score <4 and 101 (64.7%) had an average score of ≥4. On USnews.com, 111 (78.2%) had a low average score and 31 (21.8%) had a high average score. On Vitals.com, 50 (32.1%) had a low average rating and 106 (67.9%) had a high average rating. The mean rating was 4.2 on Healthgrades.com, 4.1 on Vitals.com and 3.1 on USnews.com. Table 2 presents the comparisons between physicians with low and high ratings for each of the three sites. Age, sex, race, years in practice, education and subspecialty did not influence the overall physician rating or the number of low or high ratings. Senior academic ranks (associate professor and professor) were associated with higher ratings at two out of three sites.
We subsequently compared the 140 physicians who were rated on all three PRWs. Physicians with low ratings (n=27) across all three sites were compared to those with at least one high rating (n=113) at one of the sites. Physicians with consistently low ratings had a lower academic rank (P=0.03) and were less likely to have attended a top medical school (P=0.01) (Table 3).
The agreement between ratings at the three sites was assessed. While Healthgrades.com and Vitals.com had a similar number of physicians with low ratings (n=55 and n=50, respectively, P=0.55), USnews.com had a much higher proportion of physicians with low ratings (n=111, P<0.0001). Comparing ratings ≥4 between sites, agreement ranged from 29 to 50% and kappa values revealed weak correlation in ratings between sites. Healthgrades.com and Vitals.com demonstrated the best correlation (κ=0.34) (Table 4).
Online evaluations and reviews have long been utilized to aid consumers in purchasing and selecting goods and non-health care services, with usage rates reported as high as 87% and 71%, respectively.2 While prior work by Lagu et al. in 2010 demonstrated that few physicians were reviewed via online rating sites,13 more recent publications have demonstrated increasing numbers of physicians rated and increasing numbers of ratings per physician as rating websites have spread rapidly into the health care sector.14 In our study, ratings were ubiquitous, with all providers having at least one rating on at least one of the three PRWs chosen.
Although several past publications noted low rates of awareness and usage of online rating sites,15-17 our findings parallel more recent work by Hanauer et al. in 2014 demonstrating that patients are increasingly aware of online rating sites.2 Indeed, 59% of their respondents reported that rating sites are either “somewhat important” or “very important” when making choices about physicians. These sites thus have significant impact and influence; of respondents who reported using online sites, 35% selected a provider based on good ratings and 37% chose not to see a provider with bad ratings.
Our study found that age, sex, race, years in practice, education, and subspecialty were not significantly associated with the overall physician rating or the number of low or high ratings. Senior academic ranks, however, were associated with higher ratings on two of the three sites investigated. Furthermore, lower academic rank and having attended a medical school outside the top 10 ranking were significantly associated with low ratings across all three rating sites, compared with having a high rating at one of the three sites. These findings may reflect a regional economy heavily weighted toward, and influenced by, medicine, technology, and higher education.18,19

Prior publications in the orthopaedic literature have explored the influence of a variety of factors on online physician ratings. Several largely non-modifiable surgeon characteristics have been assessed. Frost et al. in 2013 reported on a national sampling of orthopaedic surgeons’ ratings and found that surgeons in academic practice had significantly higher ratings, but found no differences based on gender or geography.20 Similarly, within the spine subspecialty, Zhang et al. evaluated 219 spine surgeons and found that being in practice for 20 years or less and being in academic practice were associated with higher ratings.21 Others have demonstrated disparate findings: Ramkumar et al. showed no differences in online ratings between academic and nonacademic providers or between arthroplasty and non-arthroplasty surgeons.22
PRWs incorporate multiple variables in their ratings. Burn et al. evaluated how accurately 14 different websites rated physicians based on physician-specific characteristics.23 Their study found that of the questions used to determine ratings, only 28% directly rated the physician, 48% rated both the physician and the office, and 24% rated the office alone. Trehan et al. evaluated the ratings and comments of 250 hand surgeons in 2016 and demonstrated that positive overall ratings were associated with a higher number of ratings, Castle Connolly status, and increased online surgeon presence.9 Donnally et al. also evaluated the association of online comments with online ratings and found that spine surgeons were more likely to receive favorable reviews for factors pertaining to outcomes and likeability/character, and negative reviews for ancillary staff interactions, billing, and office environment.4 They recommended that surgeons take an active role in modifying factors patients perceive as negative, even those not directly related to the physicians themselves.
Interestingly, we found little consistency in physician ratings across the three websites, with generally poor agreement between sites. Healthgrades.com and Vitals.com demonstrated the best, though still relatively weak, correlation (κ=0.34) and had similar numbers of physicians with low ratings, whereas USnews.com had a much higher proportion of physicians with low ratings. Comparing ratings ≥4 between the three sites, agreement ranged from 29 to 50%, and kappa values revealed weak correlation. This variability adds another challenge to the validity of PRWs. Potential explanations for the discrepancy between sites include the variable methodologies used in determining the ratings, underlying differences in each site’s participating audience, and the relatively low numbers of reviews for some providers.
Ramkumar et al. suggested that hospital rating systems may be able to address this issue: by using variations of standardized surveys such as Press Ganey, patient reviews may be obtained more accurately.22 Recently, some providers and institutions have begun making internal patient satisfaction ratings publicly available. This may help ensure that only actual patients participate in the ratings and that the questions asked measure factors that institutions and providers deem important. Ricciardi et al. compared provider-initiated patient satisfaction ratings with commercial online physician rating websites for orthopaedic surgeons and found that provider-initiated ratings showed a greater number of responses, were higher in overall ratings of patient satisfaction, and had a lower percentage of negative comments.24 Although patients may worry about the anonymity of their responses, it seems that hospitals and providers should take a more active role in collecting meaningful patient satisfaction data.
There are several limitations to our study. First, our sample of physicians was limited to providers associated with the orthopaedic surgery departments of several academic medical centers in a single metropolitan area, and findings may not be generalizable to all geographic regions, surgeons, or clinical practice scenarios. Second, the data were collected during a single month and are only reflective of that time period; rating systems are continually being updated, and a different time period might reveal different data. Third, we chose to focus on the overall physician rating per site rather than the individual categories and sub-ratings for each individual. Fourth, given that the websites are public and anonymous, some of the reviews may be inaccurate, and there is no feature to verify that reviews come from real patients. Fifth, the overall physician rating categories were narrow and varied by website. Finally, we did not require a minimum or maximum number of reviews for each surgeon included in this study.
In conclusion, online rating sites have become a pervasive tool utilized throughout society as people make decisions on purchasing and selecting services, including health care. In our physician population, education and academic rank seem to have the greatest impact on ratings. Perhaps most importantly, reviews are not consistent across sites. To address this inconsistency, a standard survey may be useful for validating and collecting patient reviews. Physicians should develop more awareness of the factors that influence ratings and contribute to the evolving discussion of how best to measure patient satisfaction.