How Metrics Affect Peer Review for Academic Jobs and Grants
We live in a time when metrics on scientific publications and citations are abundantly accessible. These metrics give an overview and input to research policy and may be used in the evaluation of research for good (or at least intriguing) reasons. Rather than spending time reading publications, reviewers take citation counts, the h-index, the journal impact factor, or similar indicators as a shortcut to understanding the contribution of a piece of work or the performance of a researcher. In addition to saving time, metrics may be used because they are perceived as good proxies for research quality, and because reviewers think comparisons based on metrics are more objective, fair and reliable than assessments not informed by any metrics. In a recent study we found that metrics are frequently used when assessing grant proposals and candidates for academic positions. Moreover, researchers with stronger bibliometric track records more often apply these measures in their research evaluations.
A majority used metrics in their reviews
Despite their convenience, metrics are controversial. Peer review is not intended to be based on metrics, but rather on the expertise of the peers and their understanding of the research under review. To better understand the role of metrics in peer assessments, we used survey data to explore the opinions and practices of researchers across academic fields and contexts in three countries. Both for the review of grant proposals and for the assessment of candidates for academic positions, a large majority of the reviewers indicated that metrics were highly or somewhat important when they identified the best proposals or candidates. In both kinds of assessments, the ‘citation impact of past publications’ and, particularly, the ‘number of publications/productivity’ were emphasized.
There was some variation between the three fields included in the study. Economists relied more on the number of publications than cardiologists and physicists did. There was only marginal variation between the countries (the Netherlands, Norway and Sweden). Still, within all three fields and countries there were variations: while a majority found metrics important, a substantial proportion found them not important. In other words, there are divergent views among reviewers on the use of metrics, which may impact their reviews. When serving as reviewers, scholars have much discretionary power and may choose to use, or not to use, metrics regardless of guidelines (which may encourage or discourage metrics). Hence, the outcome of review processes may vary with the panel members’ individual preferences for metrics.
Highly cited scholars more often used citation metrics in their assessments
Significantly, an emphasis on metrics corresponded with the respondents’ own bibliometric performance. Reviewers with higher bibliometric scores more frequently emphasized metrics when they assessed candidates for academic positions, and especially when they assessed grant proposals. For the latter, the probability of perceiving the applicants’ number of publications and citation impact as ‘highly important’ increased with the respondents’ own number of publications, with whether they had top percentile publications, and with their share of top percentile publications.
Are notions of research quality affected?
Even if metrics affect peer review, our data indicate that reviewers distinguish between characteristics of good research and bibliometrics as proxies for an applicant’s or application’s potential future success. Whereas a large majority reported metrics as important in their reviews, only one-fifth indicated that their conclusion was (partly) based on citation scores or journal impact factor when answering the more general question of what they perceived as the best research in their field. Very few respondents indicated citation scores or journal impact factor as sole indicators of the best research. Hence, there is little indication that they saw quantitative indicators as a deciding factor in what constitutes eminent science.
Impact on research agendas
When applied at the scale of the individual, indicators of number of publications and citation impact have limitations and shortcomings as performance measures. Regardless, our survey indicates extensive use of such indicators at this scale, both for the review of grant proposals and for recruitment to academic positions. Emphasis on metrics in peer assessments may impact research activity and research agendas. Researchers, especially early career researchers, need to take into consideration what kind of research will help them qualify for grants and positions. If the emphasis is on the quantifiable track records of applicants rather than on their field of expertise and potential, or on in-depth review of the proposed research, we risk basing decisions about future research on past trends, and easily publishable and highly cited topics may get disproportionately more resources. Moreover, we give young scholars (stronger) incentives to do the kind of research that appears ‘safe’ in terms of the likelihood of publications and citations, in order to increase their chances of success in competitions for grants and academic positions.
Organizing for fair evaluations
What does this mean for researchers applying for jobs or research grants? When planning and using peer review, one should be aware that reviewers, in particular reviewers who score high on metrics, find metrics to be a good proxy for the future success of projects and candidates. They rely on publication metrics in their assessments despite concerns about the use and misuse of such metrics. More generally, we need a better understanding of why and how metrics are used in different fields of research, of the role metrics play in the development of fields, and of how the profile of review panels affects the emphasis on metrics. At this scale, the publication of guidelines such as the Leiden Manifesto for research metrics and the DORA declaration concerning the improper use of the journal impact factor appears not to have been sufficient.