More and more institutions are signing up to responsible metrics manifestos such as DORA – which is great. This is no doubt influenced by funder demands that they do so – which is also great. And these manifestos are having a positive impact on researcher-level evaluation – which is triply great. But, as we all know, researcher-level evaluation issues, such as avoiding Journal Impact Factors, are only one element of the sector’s research evaluation problems.
UKRI Chief Executive Ottoline Leyser recently pointed out that any evaluation further up the food-chain in the form of university- or country-level evaluations ultimately has an impact on individual researchers. And of course the most influential of these, at the top of the research evaluation food-chain, are the global university rankings.
So why, I often ask myself, do we laud universities for taking a responsible approach to journal metrics and turn a blind eye to their participation in, and celebration of, the global rankings?
Indeed, when you look at the characteristics of Journal Impact Factors (JIFs) and the characteristics of global university rankings, they both fall foul of exactly the same four critiques.
1. The construction problem
As DORA states, there are significant issues with the calculation of the JIF: the average cites per paper for a journal over two years. Firstly, providing the mean cites-per-paper of a skewed dataset is not statistically sensible. Secondly, whilst the numerator includes all citations to the journal, the denominator excludes ‘non-citable items’ such as editorials and letters – even if they have been cited. Thirdly, the time window of two years is arguably not long enough to capture citation activity in less citation dense fields, as a result you can’t compare a JIF in one field with that from another.
However, global university rankings are subject to even harsher criticisms about their construction. The indicators they use are a poor proxy for the concept they seek to evaluate (the use of staff:student ratios as a proxy for teaching quality for example). The concepts they seek to evaluate are not representative of the work of all universities (societal impacts are not captured at all). The data sources they use are heavily biased towards the global north. They often use sloppy reputation-based opinion polls. And worst of all, they combine indicators together using arbitrary weightings, a slight change in which can have a significant impact on a university’s rank.
2. The validity problem
Construction issues aside, problems with the JIF really began when it was repurposed from an indicator to decide which journals should appear in Garfield’s citation index, to one used by libraries to inform collection development, and then by researchers to choose where to publish and finally by readers (and others) to decide which research was the best for being published there. It had become an invalid proxy for quality, rather than as a means of ensuring the most citations were captured by a citation index.
Whilst the JIF may have inadvertently found itself in this position, some of the global rankings quite deliberately over-state their meaning. Indeed, each of the ‘big three’ global rankings (ARWU, QS and THE WUR) claim to reveal which are the ‘top’ universities (despite using different methods for reaching their different conclusions). However, given the many and varied forms of higher education institutions on the planet, none of these high-profile rankings articulates exactly what their ‘top’ universities are supposed to be top at. The truth is that the ‘top’ universities are mainly top at being old, large, wealthy, English-speaking, research-focused and based in the global north.
3. The application problem
Of course, once we have indicators that are an invalid proxy for the thing they claim to measure (JIFs signifying ’quality’ and rankings signifying ‘excellence’) third parties will make poor use of them for decision-making. Thus, funders and institutions started to judge researchers based on the number of outputs they had in high-JIF journals, as though that somehow reflected on the quality of their research and of them as a researcher.
In a similar way, we know that some of the biggest users of the global university rankings are students seeking to choose where to study (even though no global ranking provides any reliable indication of teaching quality) because who doesn’t want to study at a ‘top’ university? But it’s not just students; institutions and employers are also known to judge applicants based on the rank of their alma mater. Government-funded studentship schemes will also often only support attendance at top 200 institutions.
4. The impact problem
Ultimately, these issues have huge impacts on both individual careers and the scholarly enterprise. The problems associated with the pursuit of publication in high-JIF journals have been well-documented and include higher APC costs, publication delays, publication of only positive findings on hot topics, high retraction rates, and negative impacts on the transition to open research practices.
The problems associated with the pursuit of a high university ranking are less well-documented but are equally, if not more, concerning. At individual level, students can be denied the opportunity to study at their institution of choice and career prospects can be hampered through conscious or unconscious ranking-based bias. At institution level, ranking obsession can lead to draconian hiring, firing and reward practices based on publication indicators. At system level we see increasing numbers of countries investing in ‘world-class university’ initiatives that concentrate resource in a few institutions whilst starving the rest. There is a growing inequity both within and between countries’ higher education offerings that should seriously concern us all.
What to do?
If we agree that global university rankings are an equally problematic form of irresponsible research evaluation as the Journal Impact Factor, we have to ask ourselves why their usage and promotion does not form an explicit requirement of responsible metrics manifestos. An easy answer is that universities are the ’victim’ not the perpetrator of the rankings. However, universities are equally complicit in providing data to, and promoting the outcomes of, global rankings. The real answer is that the rankings are so heavily used by those outside of universities that not to participate would amount to financial and reputational suicide.
Despite this, universities do have both the power and the responsibility to take action on global university rankings that would be entirely in keeping with any claim to practice responsible metrics. This could involve:
- Avoiding setting KPIs based on the current composite global university rankings.
- Avoiding promoting a university’s ranking outcome.
- Avoiding legitimizing global rankings by hosting, attending, or speaking at, ranking-promoting summits and conferences.
- Rescinding membership of ranking-based ‘clubs’ such as the World 100 Reputation Academy.
- Working together with other global universities to redefine university quality (or more accurately, qualities) and to develop better ways of evaluating these.
I recently argued that university associations might develop a ‘Much more than our rank’ campaign. This would serve all universities equally – from those yet to get a foothold on the current rankings, to those at the top. Every university has more to offer than is currently measured by the global university rankings – something that I’m sure even the ranking agencies would admit. Such declarations would move universities from judged to judge, from competitor to collaborator. It would give them the opportunity to redefine and celebrate the diverse characteristics of a thriving university beyond the rankings’ narrow and substandard notions of ‘excellence’.
The time has come for us to extend our definition of responsible metrics to include action with regards to the global university rankings. I’m not oblivious to the challenges, and I am certainly open to dialogue about what this might look like. But, we shouldn’t continue to turn a blind eye to the poor construction, validity, application and impact of global rankings, whilst claiming to support and practice responsible metrics. We have to start somewhere, and we have to do it together, but we need to be brave enough to engage in this conversation.