The Wolf at the Door…Furthermore about Rankings

As I wrote in my previous posting (“The Wolf at the Door: A Parable about Ratings”), there are at least four tests that a good system of rating meets: it is objective and transparent; it tests a truly representative sample; it proves to be a valid predictor of some outcome about which we care; and its categories differentiate experience in a way that is statistically significant (unlikely to be due to chance).

Ratings simply bunch players into a category (AAA, AA, A, and so on). And they are everywhere in society. Meat inspections, safety inspections, and T.V. viewer ratings probably meet the four tests. Grading of students by instructors is a form of rating; done well, it conforms to the four criteria. “Star” ratings on Amazon.com or Rottentomatoes.com flunk most of the criteria, as do online ratings of instructors by students.

Rankings list the players in some order of priority. We can apply the same four tests to rankings as we can to ratings. The New York Times list of best-selling books probably passes the tests of objectivity and representativeness; we could challenge it on the basis of significance and validity: the list is a measure of sales volume or popularity, not quality. Are the weight loss and self-help books that rise to the top of best-seller lists really the best literature that civilization affords?

Investment banking league tables are rankings too. Many bankers have issues with the way these league tables are constructed—for instance, how is credit awarded when there are two or more advisers? These league tables can be challenged on the basis of objectivity, significance, and validity. A recent critique in the Wall Street Journal noted that the league tables are measures of activity, not results.

Then we have business school rankings. In time, the story of the wolf and three pigs might apply here too. Take a moment to consider the four criteria:

**Objectivity and transparency: Very few of the B-school rankings are replicable by outsiders. Many of the rankings rely on arbitrary scoring of the schools on various criteria. And pity the poor school that fills in the questionnaire incorrectly or incompletely: in the history of school ratings, some of the raters have simply made up the data rather than collect accurate data from respondents. ((The President of Sarah Lawrence College revealed that U.S. News & World Report made up data that was missing for SLC. See “The Cost of Bucking College Rankings.” In 2007, Fortune magazine completely omitted UNC-Chapel Hill’s B-school from the ranking, prompting a dean to catalogue “the shoddy, inaccurate and inappropriate research methods employed,” after which Fortune acknowledged that UNC was “incorrectly ranked.”))

**Representativeness: Virtually none of the B-school rankings warrant that the samples on which their surveys are based are representative of the larger population of alums, recruiters, or deans on whom they draw.

**Validity: What do the rankings measure? Is what they measure of any interest to those who care about excellence in management education? Do the rankings truly measure quality? Quality of what? Ideally, the rankings would measure the quality of the learning experience.

**Significance: None of the B-school rankings publish measures of variation on the underlying data, such as the standard deviation. In the absence of such statistics, it is impossible to tell whether the differences among the ranking categories are significant. For instance, is being ranked #16 significantly different from being ranked #10, or #1? We simply don’t know.
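To make the significance point concrete, here is a minimal sketch of the kind of check that published standard deviations would allow. Everything in it is an assumption for illustration: the function name, the survey scores, the standard deviations, and the sample sizes are hypothetical, and the two-sample z-test (a normal approximation) is only one of several tests a rater might choose.

```python
import math

def difference_is_significant(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z_crit=1.96):
    """Two-sample z-test on summary statistics (normal approximation).

    Returns the z statistic and whether |z| exceeds the critical value
    (1.96 corresponds to a two-sided 5% significance level).
    """
    # Standard error of the difference between the two sample means.
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = (mean_a - mean_b) / standard_error
    return z, abs(z) > z_crit

# Hypothetical survey scores for two schools ranked a few places apart.
z_close, close_significant = difference_is_significant(8.4, 1.5, 60, 8.1, 1.6, 55)

# Hypothetical scores for two schools that are far apart.
z_far, far_significant = difference_is_significant(8.4, 1.5, 60, 7.0, 1.6, 55)

print(f"close pair: z = {z_close:.2f}, significant = {close_significant}")
print(f"far pair:   z = {z_far:.2f}, significant = {far_significant}")
```

With these made-up numbers, the small gap in scores cannot be distinguished from chance while the large gap can. Without the standard deviations and sample sizes, a reader of a ranking cannot make even this rough check.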

Apparently no one ((See reports by UNESCO and the European Center for Higher Education, Dean Andrew Policano, and AACSB.)) who has taken a deep dive into the B-school rankings thinks they meet the smell test. In an assessment of rankings, a task force of the AACSB concluded, “Measures used in media rankings are often arbitrary, selected based on convenience, and definitely controversial. Characteristics that are of little importance are often included, while important characteristics are excluded because they are more difficult to measure. Even when the measures do correlate with quality, media attempts to draw significant differences among similar programs are inappropriate. Indeed, weights that are applied to different characteristics to determine ranks are subjective and generally not justified. Two additional problems plague the rankings data. First, the data itself can be expensive for schools to provide. Schools can’t afford not to participate, and many have had to hire additional staff to respond to the increasing number of media requests for data. Although there is substantial overlap in the types of MBA data collected, each media survey requests some unique data and applies different definitions. The end result is that schools spend an extraordinary amount of time preparing data for media surveys. Second, the data reported to and published by the media are inconsistent. The lack of formal definitions and verification processes, combined with the highly visible and influential role of data in rankings, has been a recipe for highly implausible data. This task force believes that media rankings have had other more serious negative impacts on business education. Because rankings of full-time MBA programs are commonly presented under the label of “best b-schools,” the public has developed a narrow definition about the breadth and value of business education. 
This diminishes the importance of faculty research, undergraduate programs, and doctoral education and compels schools to invest more heavily in highly-visible MBA programs. Many schools have reallocated resources to activities that can enhance its ranking, such as marketing campaigns, luxurious facilities for a small number of MBA students, and concierge services for recruiters; but these gestures have little to do with quality. The result is an increase in the cost of delivering an MBA program, which generally translates to higher tuition for students. Rankings that rely on student or recruiter satisfaction can favor surface-level changes over substantive improvements. Similarly, rankings based on formulas that include student “selectivity” motivate schools to shrink entering classes and reduce diversity to “pump-up” statistics, such as average GMAT scores.”

The best we can say is that the rankings are simply data, not necessarily knowledge, wisdom, or absolute Truth. One might look at several rankings to gain a sense of the field. But as an insider to that field I see vast differences among the schools that the rankings don’t capture and that account for considerable variation in the learning experiences of students. My message to applicants: there is no substitute for on-the-ground research; you must do your own homework.

I pay attention to rankings because people I care about pay attention to them. All of Darden’s stakeholders want to be part of an enterprise of consequence. The rankings are one indicator of Darden’s impact. We steer the school by our mission and vision, not by our rankings. Steering by rankings would be like a CEO steering a business from each day’s closing stock price—what Warren Buffett calls “driving in the rear-view mirror.” We will not allow the rankings to dictate who we are. Fortunately, the rankings treat Darden relatively well. ((AACSB estimates that there are some 10,000 schools world-wide that award degrees in business. And the AACSB has accredited about 500 MBA-granting schools in the U.S. There, Darden ranks #4 in Forbes, #10 in Wall Street Journal, #12 in the Economist, #12 in U.S. News & World Report, #16 in BusinessWeek, and #16 in Financial Times. With an average ranking of 12, Darden is in the top 3% of AACSB-accredited U.S. B-schools.)) But I share the critics’ concerns. Any possible benefits from the rankings depend crucially on the quality of the measures. The publications have yet to persuade me about the quality of their measures and the meaningfulness of the rankings. Flawed metrics can lead to flawed decisions, as my previous posting (“The Wolf at the Door”) argues. Belatedly, we grieve the damage to the global economy from flawed debt ratings. Let us treat with similar caution the sustained effect of rankings on management education.
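The arithmetic behind the average ranking and the "top 3%" figure can be reproduced in a few lines. This is only a sketch: the ranks and the count of about 500 accredited U.S. schools are taken from the text above, while the variable names are my own. The six ranks sum to 70, so the average is about 11.7, which rounds to 12; 12 out of 500 is 2.4%, consistent with "top 3%."

```python
# Rankings as quoted in the text above.
darden_ranks = {
    "Forbes": 4,
    "Wall Street Journal": 10,
    "Economist": 12,
    "U.S. News & World Report": 12,
    "BusinessWeek": 16,
    "Financial Times": 16,
}

# "About 500" AACSB-accredited MBA-granting U.S. schools, per the text.
accredited_us_schools = 500

# Simple unweighted mean of the six published ranks.
average_rank = sum(darden_ranks.values()) / len(darden_ranks)

# Share of accredited schools at or above the rounded average rank.
share = round(average_rank) / accredited_us_schools

print(f"average rank: {average_rank:.2f}")
print(f"rounded average rank as a share of accredited schools: {share:.1%}")
```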

Posted by Robert Bruner at 12/29/2008 06:15:41 PM