Citing Bytes - Adventures in Data Citation: Report from IDCC 2011

with thanks to www.phdcomics.com

I spent most of last week in Bristol at the 7th International Digital Curation Conference, and had a grand old time talking about data and citations. The first thing I went to was a workshop entitled "Data for Impact: Can research assessment create effective incentives for best practice in data sharing?"

The short answer to this is, yes, but...

There's no denying that the Research Excellence Framework ("REF", for short) impacts on how research is disseminated in this country. An example was given: engineers typically publish their work in conference proceedings that are very well refereed and very competitive, with high impact in the field internationally. But because these conference proceedings weren't counted in the RAE (the Research Assessment Exercise, the REF's predecessor), the message came back to the engineering departments that they had to publish in high-impact journals. So the engineers duly did, with the net result that this (badly) impacted their international standing.

There's the double whammy too, that the REF is essentially a data collection exercise, and the universities put a lot of time and effort into it - but there's no data strategy associated with the REF, and data isn't a part of it!

The REF is very concerned with publications (the number that got mentioned was that publications form 65% of the return), so we had a lot of discussion on how we could piggy-back on publications, and essentially produce "data publications" to get data counted in the REF. (Which is what I'm trying to do at the moment...)

Leaving aside the question of why we're piggy-backing on a centuries-old mechanism for publicizing scientific work (i.e. journals) when we could be taking advantage of this cool new technology to create other solutions, there are other issues associated with this. Sure, we can assign DOIs to all the data we can think of (in suitable, stable repositories, of course), but that doesn't mean they'll be properly cited in the literature. People aren't used to citing data and haven't understood the benefits of it, and, perhaps most importantly, the metrics aren't there to track data citation!

We talked a fair bit about metrics, specifically altmetrics, as a way of quantifying the impact of a particular piece of work (whether data or not). These haven't really gained any ground when it comes to the REF, mainly, I suspect, because they lack a critical mass of users, though it is early days. There's some really interesting stuff, and I for one will be heading over to total-impact.org and figshare.com in the not too distant future to play with what they've been doing over there.

If we could convince the REF to count data, either as a separate research output, or even as a publication type, then that would be excellent. Sure, there were concerns that if data was a publication type, then it would be ignored in favour of high-impact journal publications (why count your dataset when you've got multiple Nature papers and four slots to report publications in?) but it could make life better for those researchers who never get a Nature paper, because they're so busy looking after their data.

I suspect though that it's too late to get data into the next REF in 2014, but maybe the one after that? Time to start lobbying the high-up people who make those sorts of decisions!

Tuesday, 13 December 2011

Report from IDCC 2011 - Data for Impact workshop