This post is a follow-up to our COSIMA meeting discussion on how we might keep track of people who have contributed to model development / simulations, along with their preferences for how (or if) they’d like to be acknowledged.
At the meeting, we agreed that it would be valuable to have a more formal way of recognising contributions to model configuration and simulation efforts. COSIMA’s culture of sharing configurations and output has enabled amazing science and helped grow our community — and we’d love to ensure that this collaborative culture continues.
We recognise that the level of attribution people would like will vary depending on their contribution and personal preference. Our proposal is to collate these attribution requests and share a list at a COSIMA meeting every 3 months. This would ensure visibility, encourage consistency, and give recognition to contributors.
We’re seeking input in two ways:
1. Your thoughts on the attribution proposal:
Do you have suggestions or thoughts on the idea?
2. Technical help in implementing it:
We’ve discussed adding a field (or using an existing unused one) to the metadata.yaml file for each simulation. This would be a free text field where the names of contributors could be listed along with any specific attribution requests (e.g. acknowledgement, request for coauthorship, timeframe etc). We would then need a tool to extract these fields from all the metadata.yaml files into a spreadsheet for us to share at quarterly COSIMA meetings.
Aidan
(Aidan Heerdegen, ACCESS-NRI Release Team Lead)
2
Totally agree it is important that contributions are acknowledged appropriately.
From a purely technical point of view, we have a schema for metadata.yaml that we use to validate that the file is correct and to design software so that it is consistent.
So if you want to add a field we’ll need to update the schema.
For that reason my preference would be to use an existing field, especially if the contribution text is fairly free form.
You might also want to consider how that information is presented. For example is it made obvious through the intake catalogue? Do you want it included in all intake datasets? In which case it might make sense to make it a separate field.
One option is to create DOIs for the data from individual experiments. This would rest on the expectation that researchers cite both the model paper and the DOI of the specific experiment whose data they used.
Interesting idea. On a practical point, can that work using the current repository structure where we have one experiment per branch on ACCESS-Community-Hub/access-om3-experiments? Or would we need to have one experiment per repository?
Aidan
(Aidan Heerdegen, ACCESS-NRI Release Team Lead)
5
Reviving this thread after our COSIMA meeting discussion this morning. @hrsdawson, @aekiss and I think we need a new field in metadata.yaml called e.g. attribution_request. Would that be possible, @Aidan?
Aidan
(Aidan Heerdegen, ACCESS-NRI Release Team Lead)
7
It’s certainly possible.
Just so I understand, what would the field contain? Is it free text, or do you want it machine readable?
If each contributor requests different attribution/embargo/collaboration/whatever and we want it to be readily parsed, I guess we'd need something like this in metadata.yaml: a "contributors" dictionary in which each key is an individual contributor's name and each value is a dictionary containing "email", "contributions" and "requests" keys whose values are free-form text, e.g.
contributors:
  'Marie Curie':
    email: m.curie@ens.fr
    contributions: 'implemented radium tracer in MOM6'
    requests: 'to be offered collaboration on projects using radium tracer'
  'Alan Turing':
    email: al@bletchley.gov.uk
    contributions: 'configured and ran this simulation for my PhD'
    requests: "until Dec 2027, check with me before using this output (so you don't beat me to publication)"
  'Grace Hopper':
    email: grace@upenn.edu
    contributions: 'created configuration for Europa'
    requests: 'contact me for assistance before using, to avoid misconfiguration'
Configs will be cloned from ACCESS-NRI GitHub repos (e.g. github.com/ACCESS-NRI/access-om3-configs), where they will already have a contributors entry in metadata.yaml containing the names, contributions and requests of any developers who would like recognition. Users can then reconfigure them for experiments and add their own entries giving their requests.
It’s assumed that the absence of an entry with your name means you’re making no requests of users of the config and resulting data.
Hopefully the contributors lists would not grow so long as to become onerous for people using a given config/dataset to honour everyone’s requests.
Thanks @aekiss and @adele-morrison. I like this idea and think the process that it’s looking to support is super important.
I’m wondering how this process fits in with the existing CITATION.cff file?
Here’s an example file:
Or perhaps it doesn’t at all? Perhaps it’s only for listing community contributors? Or maybe this is only for people running the configuration? (Development authors being listed in the CITATION.cff? This line gets a bit blurry).
Re: the CITATION.cff file. Can someone who has been more involved in that process speak to its intended longer-term use? I think that might be @Aidan, @anton or @tmcadam? Personally, I’ve always assumed that the CITATION.cff file will be used for the model description/evaluation paper(s), but we do use them at the moment when compiling talks/presentations on om3.
On the existing proposal. Is it important that it’s exposed/prompted through intake in some way? If yes, we might want to get some feedback from the intake folks.
Aidan
(Aidan Heerdegen, ACCESS-NRI Release Team Lead)
12
Thanks for the detailed proposal.
From a technical point of view I don’t think it is strictly necessary for the contributors to be a dictionary. It can just be a list of contributors, each of which is a dictionary, e.g.
contributors:
  - fullname: Marie Curie
    email: m.curie@ens.fr
    contributions: implemented radium tracer in MOM6
    requests: to be offered collaboration on projects using radium tracer
  - fullname: Alan Turing
    email: al@bletchley.gov.uk
    contributions: configured and ran this simulation for my PhD
    requests: until Dec 2027, check with me before using this output (so you don't
      beat me to publication)
  - fullname: Grace Hopper
    email: grace@upenn.edu
    contributions: created configuration for Europa
    requests: contact me for assistance before using, to avoid misconfiguration
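For the extraction tool mentioned at the start of the thread, here is a rough Python sketch of how the contributors entries might be gathered into a spreadsheet-friendly CSV. It is only an illustration under several assumptions: the field is called contributors, it uses either the dictionary layout or the list layout shown above, each experiment's metadata.yaml has been checked out into its own directory under a common root, and PyYAML is installed. The function names (contributor_rows, write_csv, collect) are made up for this example and are not part of any existing ACCESS-NRI tool.

```python
import csv
from pathlib import Path

# Column order for the spreadsheet. "experiment" is filled in by the tool;
# the remaining names follow the examples in this thread.
FIELDS = ["experiment", "fullname", "email", "contributions", "requests"]


def contributor_rows(experiment, metadata):
    """Flatten one parsed metadata.yaml mapping into a list of row dicts."""
    contributors = metadata.get("contributors") or []
    if isinstance(contributors, dict):
        # Dictionary layout (name -> details): convert to the list layout.
        contributors = [
            {"fullname": name, **(details or {})}
            for name, details in contributors.items()
        ]
    rows = []
    for entry in contributors:
        # Missing keys become empty cells rather than errors.
        row = {field: entry.get(field, "") for field in FIELDS}
        row["experiment"] = experiment
        rows.append(row)
    return rows


def write_csv(rows, handle):
    """Write the rows, with a header line, to an open text file handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def collect(root):
    """Find every metadata.yaml under `root` and gather contributor rows.

    Assumption: the directory holding each metadata.yaml is named after
    the experiment.
    """
    import yaml  # PyYAML (third-party), assumed to be installed

    rows = []
    for path in sorted(Path(root).rglob("metadata.yaml")):
        with open(path) as f:
            metadata = yaml.safe_load(f) or {}
        rows.extend(contributor_rows(path.parent.name, metadata))
    return rows
```

Something like `write_csv(collect("experiments"), open("attribution.csv", "w", newline=""))` would then produce a file that opens directly in a spreadsheet, where "experiments" is a placeholder directory name. With one experiment per branch, each branch would need to be checked out (or its metadata.yaml fetched) before running this.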
As emails can change, ORCID would be a good field to include (maybe optional).
There is an existing contributor role taxonomy (CRediT). If that looks useful it would be a good idea to have a CRediT field to make this ingestible/usable more widely.
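If ORCID and CRediT fields were added, a validation step could catch typos before they reach the catalogue. The sketch below is only a suggestion: the function names are invented for illustration, and nothing here is part of the current schema. The ORCID check uses the ISO 7064 MOD 11-2 check digit that ORCID iDs carry, and the role list is the 14 roles of the CRediT taxonomy.

```python
import re

# ORCID iDs are 16 characters in four hyphenated groups; the final
# character is a check digit that may be "X".
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")

# The 14 roles defined by the CRediT taxonomy.
CREDIT_ROLES = {
    "Conceptualization", "Data curation", "Formal analysis",
    "Funding acquisition", "Investigation", "Methodology",
    "Project administration", "Resources", "Software", "Supervision",
    "Validation", "Visualization", "Writing – original draft",
    "Writing – review & editing",
}


def orcid_is_valid(orcid):
    """Return True if `orcid` is well formed and its check digit matches."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    chars = orcid.replace("-", "")
    total = 0
    for digit in chars[:15]:
        # ISO 7064 MOD 11-2: add each digit, then double the running total.
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[15] == expected


def unknown_credit_roles(roles):
    """Return the entries in `roles` that are not CRediT role names."""
    return [role for role in roles if role not in CREDIT_ROLES]
```

For example, `orcid_is_valid("0000-0002-1825-0097")` accepts the sample iD used in ORCID's own documentation, while a single mistyped digit fails the checksum.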
From a licensing point of view, are “requests” binding?