Science management continues to be obsessed with so-called objective measurements of progress. Research institutions want to show how competitive they are and increase their prestige, policymakers want to know whose advice to trust, and funders want to back only the “best research groups”. But the use of quantitative measures or “metrics” as key elements in decision making has raised a wide variety of objections. Back in 1975 Charles Goodhart encapsulated the dilemma in what became known as Goodhart’s law, commonly paraphrased as “When a measure becomes a target, it ceases to be a good measure”. Goodhart’s law is essentially the sociological analogue of Heisenberg’s uncertainty principle in quantum mechanics, predicting that measuring a system usually disturbs it. And it may not only disturb the system in expected ways but also spawn unpredictable side effects. The result of defining metrics that carry economic or social significance is to encourage people to “game” the measures for greater rewards.
In inventing the citation index, Eugene Garfield provided a simple measure of success and helped develop a new field dedicated to simplistic, often misleading and frequently misused bibliometric indices. With the journal impact factor (JIF) and its many variations, the h-index for individuals, and even time series that extract trends from these data, there is apparently little need to actually read the papers in order to make a judgement. The most recent addition to this plethora of markers is altmetrics, which supposedly capture the ways in which articles are disseminated through social media. Every article in this journal is subject to all of these measures, and whilst they might indicate some sort of success in productivity, and even in dissemination, they cannot directly show the value of the article.
What of these altmetric measures? Can we believe that their collection is even useful? For example, the number of tweets, one of the altmetric measures, is said to be a valuable indication of dissemination despite evidence to the contrary: the papers with the most tweets are often not about key science, and the number of tweets can be artificially inflated. Much as Facebook “likes” accumulate, tweets can be multiplied rapidly without the senders ever seeing the original article or understanding its contents.
The UK recently published an independent review of research metrics, The Metric Tide, to highlight what is valuable and what is misleading (http://www.hefce.ac.uk/media/HEFCE,2014/Content/Pubs/Independentresearch/2015/The,Metric,Tide/2015_metric_tide.pdf). The review was wide-ranging and made some very clear recommendations. A key one is that peer review must remain the primary method of assessment and that any other indicators need to be tailored to the subject and interpreted only in context: there is no one-size-fits-all. So in considering doctorates, promotion, tenure, funding or priority, where value rather than productivity is the issue, the current metrics can be very misleading, especially the JIF. And setting a target minimum publication rate per researcher, as some universities and research centres do, is a mark of poor management, substituting output for understanding and prizing production over value.
Some may complain that this portrayal of the use of bibliometric tools is unfair and that they are never used on their own. Yet they carry weight for job applicants, for promotion boards, for departmental rankings and, in parts of Asia, even for gaining a PhD. Anyone reading this will know that the positive use of your data and publications by the leaders in your field is worth more than any bibliometric score in assessing the value of your scientific contribution.
The British review offers ideas that are international in their application and should be widely considered. Let us not be seduced by bibliometric indices but see them merely as adjuncts to a proper understanding of value, gained from actually reading the contributions themselves.