Monday 9 January 2012

H-Factors and Citation Metrics

Interesting article in the Guardian this week about h-indexes, and citation metrics in general. Not a fracking-related post, but citation metrics are never far from the mind of any ambitious academic.

http://www.guardian.co.uk/commentisfree/2012/jan/06/bad-science-h-index

Citation metrics are how academics, such as myself, are judged. There are a number of different systems, all of them broadly based on the number of citations you receive. So before I go any further I guess I should explain what a citation is:

Science never operates in a vacuum: we are always utilising, building on, confirming (or disproving) pre-existing theories. So when scientists write academic papers to be published in academic journals, they cite the work of previous scientists whose work has relevance to the new paper. For instance, to quote from a paper I am writing at the moment: 'The magnitude of the fracture compliance is usually scaled to the number density and length of the fractures (e.g., Hudson, 1981).' At the end of the paper there will be a full list of all the papers cited, giving the journal, volume, page number, etc. of each cited paper.

If a paper is good, interesting, exciting, or provides a useful method that other scientists will use, then it will tend to attract a lot of citations. So, in short, the more citations you have, the better scientist you are. Good papers get cited, crap ones get ignored. This is the way the scientific community as a whole passes judgement on the work of each scientist as an individual.

Obviously, this doesn't always go quite to plan - for example, a paper may attract a lot of citations for being wrong, so people will cite it as an example of what not to do. You also tend to get so-called 'copycat' citations - let's say a big-name author cites a particular paper. Now this paper may not be all that great; the big-name author only cited it because he/she was in a bit of a rush and it kind of fitted the bill for something he/she was saying. However, when everyone reads the big-name author's paper they see this citation and begin citing the paper as well, creating a lot of citations for a paper that wasn't actually all that worthy.

Nevertheless, despite these issues, I don't think I've seen a better way of objectively assessing a paper's quality than by counting the number of times it's cited.

The crudest citation metric is simply to count the total number of citations you have across all your papers, or the average number of citations per paper. However, a more sophisticated method is the h-index. Your h-index is the largest number h such that you have h papers that have each been cited at least h times. So, in my case, I have 14 publications at present (listed here). By listing them in order of the number of times they have been cited, we can compute my h-index.

Paper Number   Year   Number of Citations
1              2008   16
2              2009   9
3              2010   6
4              2010   5
5              2007   4
6              2009   4
7              2010   4
8              2011   2
9              2011   2
10             2011   2
11             2011   1
12             2011   0
13             2011   0
14             2011   0
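
For anyone who wants to check the arithmetic, here is a minimal Python sketch that computes an h-index from a list of citation counts (the counts are taken straight from the table above; the function name is just my own choice for illustration):

def h_index(citations):
    # Sort citation counts from highest to lowest
    counts = sorted(citations, reverse=True)
    # The h-index is the largest h such that at least h papers
    # have been cited at least h times each
    h = 0
    for i, c in enumerate(counts, start=1):
        if c >= i:
            h = i
        else:
            break
    return h

# Citation counts for my 14 papers, from the table above
my_citations = [16, 9, 6, 5, 4, 4, 4, 2, 2, 2, 1, 0, 0, 0]
print(h_index(my_citations))  # prints 4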

So I have at least 4 papers that have been cited at least 4 times, meaning that my h-index is 4. I do not have 5 papers that have each been cited at least 5 times. Once one of papers 5, 6 or 7 gets cited one more time, I'll have an h-index of 5 (yay!).

My h-index is pretty low, for two reasons. Firstly, I'm a very young scientist, so most of my papers have only been published in the last couple of years - there hasn't been time for other scientists to read them, use them and then cite them (you'll notice that my most cited papers are pretty much the oldest ones). Secondly, I work in applied geophysics, which historically has a pretty poor citation rate. This is because a paper can have a big impact, and lots of people from BP, Shell, Exxon etc. will use it to get oil and gas out of the ground. However, these people don't write papers saying how useful your paper was in helping them do this; they just laugh all the way to the bank. They might thank you in person at a conference, they might even sponsor your research, but they won't write a paper and cite you, meaning that a really significant applied geophysics paper may not be particularly well cited.

This variance between disciplines is often given as one of the major problems with citation metrics, but it doesn't bother me so much, as I'm unlikely to be competing with a biologist (biologists tend to cite each other a lot, so have much higher h-indexes) for a job any time soon. However, it is true that your h-index can be important when a potential employer is sifting through 50 applicants for one position. While a good h-index alone won't be enough to get you a job, a poor h-index can be enough to see you rejected.

I'll bring this rather rambling post to a close now. Some scientists really loathe citation metrics (have a look at the comments section in the Guardian article). Personally, I don't really have a strong view. I can see the validity of the points made against them - there will certainly be individual cases where the h-index does a very poor job of representing ability. However, I think we must accept that there are now too many academics in the world for us all to be assessed as individuals - it would simply take too much time, and be too subjective. So as a general rule, I guess a citation metric provides a decent overview, although we must always be prepared to consider individual cases on their merits.

P.S. This issue of subjectivity has inspired me to write a little more. Before citation metrics, getting a shot at the best jobs and funding opportunities depended largely on your ability to negotiate the (often extremely petty, underhand and subjective) world of academic politics (e.g., that person gave my paper a bad review or slagged it off in their paper, so I'm going to slag off their funding application). While citation metrics aren't perfect, they do at least help guard against this sort of thing...
