Hosting experiment repo on Github

With @spencerwong 's guidance, I/we have made some slight modifications to the ACCESS-ESM1.5 release-preindustrial+concentrations configuration to run it in pacemaker mode - where SSTs are restored to specific values each year, but all other fields evolve freely. The experiment is described here: Experiment Proposal: Indo-Pacific pacemakers

In the interests of open science: how should I go about sharing this set-up (and the inputs, which are a netcdf of the prescribed SSTs and a mask netcdf) on github? For context: I have barely used github so I am very sorry but I would need to be walked through the steps for hosting the experiment repo. Posting here because maybe there will be someone else in the same situation one day!

Related to this post but seeking some more detailed guidance: Utilising GitHub effectively for experiments and configurations

@georgyfalster First thing is to add you to the ACCESS-Community-Hub organisation on GitHub and I need your github handle to invite you, please.

@georgyfalster We do have a tutorial which includes a section on doing precisely this. Once your account is added to the ACCESS-Community-Hub organisation, first follow the steps under Authorise with Github here, which will authorise your Github account on Gadi. I think the following stuff under Perturbation is stuff you’ve already done.

Once you’ve done that, follow the steps under Push to repo here. It looks like you have multiple branches for the different SST variability scenarios, so you’ll want to use git push <repo_name> --all. Note that the name you set for the remote is the name you’ll have to sub in for <repo_name>. Once this is done, it should be available for others to clone and fork on Github.
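As a rough sketch of what pushing every branch looks like, here is a version that runs entirely on local disk, with a throwaway bare repository standing in for the empty GitHub repository. The remote name hub, the branch names scenario-a and scenario-b, and the file contents are all made up for illustration; on Gadi the remote would point at the GitHub URL and the branches would be your real scenarios.

```shell
set -e
demo=$(mktemp -d)

# Stand-in for the empty repository on GitHub (assumption: local bare repo).
git init -q --bare "$demo/remote.git"

# Stand-in for a control directory with one commit.
git init -q "$demo/control"
cd "$demo/control"
git config user.name "Example User"
git config user.email "user@example.com"
echo "demo configuration" > config.yaml
git add config.yaml
git commit -q -m "Initial configuration"

# Two hypothetical scenario branches pointing at the same commit.
git branch -m scenario-a
git branch scenario-b

# "hub" is the name chosen for the remote, so it is what replaces <repo_name>.
git remote add hub "$demo/remote.git"
git push -q hub --all

# List the branches that now exist on the remote.
git ls-remote --heads hub
```

The only part that matters for the real workflow is the last three commands: name the remote, push with --all, then check what arrived.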

If there are any issues, please ping me and I’ll take a look.

thanks Claire - my github handle is georgyfalster

Invitation sent, please accept it when you can.

thanks - done (: I’m about to be in the lab for a couple of weeks but in any downtime I’ll have a go at following the instructions @lachlanswhyborn linked above.

thanks for this - one question. So I have four experiments, each with its own control directory. The tutorial says “Now you can create a repository from your perturbation experiment control directory using gh repo create”. So I will do this four times - once from each model control directory, and in each case with a different repo name?

I only ask cos of your comment ‘it looks like you have multiple branches for the different SST variability scenarios…’ which sort of sounded like they could all go in the same repo(?) and/or should the remote name be the same for all 4 experiments?

sorry for the basic git questions :sweat_smile:

Yea having one repo for each variability would work, but not ideal I think- we’d end up with many repos pretty quickly. I’ll do some playing around to see if there’s an easy way to turn each of those control directories into branches of one repository.

Ok, I have a workflow that I think will work. Perform the Authorise with Github and Push to repo steps in one of the control directories, but don’t do the git push <repo_name> --all step. Once you’ve done that, go to the repository you’ve just created on ACCESS-Community-Hub (you should be able to search that organisation by the repo_name you just used) and get the URL of the repository, which we’ll call <repo_URL>. Then do this in each of your other control directories:

  1. git remote add origin <repo_URL>. This tells the current directory to treat the repository on Github as its remote, the “master” copy in a sense.
  2. To check that this worked, do git remote -v. It should show that URL for fetch and push.
  3. git push --set-upstream origin <branch_name>. This should push the current control directory to the remote as a branch with the given name. Say the directory you’re in is the 30% variability reduction case, then name it something like 30pc_SST_variability_reduction.
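The three steps can be sketched end to end as below. This is only an illustration under some assumptions: a bare repository on local disk plays the role of <repo_URL>, the directory name control_30pc is hypothetical, and a local branch with the target name is created first, because git push origin <branch_name> only works if a local branch of that name exists.

```shell
set -e
demo=$(mktemp -d)

# Stand-in for <repo_URL> (assumption: a local bare repo instead of GitHub).
repo_url="$demo/hub.git"
git init -q --bare "$repo_url"

# Stand-in for one control directory that already has a commit.
git init -q "$demo/control_30pc"
cd "$demo/control_30pc"
git config user.name "Example User"
git config user.email "user@example.com"
echo "30% variability reduction" > config.yaml
git add config.yaml
git commit -q -m "30pc configuration"

# Step 1: point this directory at the shared repository.
git remote add origin "$repo_url"

# Step 2: confirm the URL is listed for fetch and push.
git remote -v

# Step 3: create a local branch with the scenario name, then push it
# and record origin as its upstream.
git checkout -q -b 30pc_SST_variability_reduction
git push -q --set-upstream origin 30pc_SST_variability_reduction
```

Repeating the same commands in each of the other control directories, with a different branch name each time, should leave one branch per scenario in the shared repository.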

Hopefully this works, I might have missed a step in there, so as always, let me know if there are problems.

I followed the ‘Push to repo’ steps at Running Model Experiments with payu and git for one of my four model control directories but it doesn’t seem to be showing up in the ACCESS Community Hub. Did I maybe miss a step to link my repo to the community hub? The other thing is that the setup steps didn’t go exactly as per the example shown - the ‘add a remote’ and ‘push commits’ options didn’t appear. The last question to appear was Clone the new repository locally? - to which I answered Y.

The new (empty) repo is on my github georgyfalster/indo-pacific-pacemakers-sst-m60 · GitHub

Hmm, that is strange. The Repository owner stage should have set the owning organisation to ACCESS-Community-Hub. Do you have a screenshot of the terminal you used to do the Push to repo step? We’ll take a different approach to get your experiments on Github, but it would be useful for us to see where things differed from what we expected so we can adjust our instructions.

It might be easier if you manually create the repo on Github under ACCESS-Community-Hub- so the same thing you did for your own repo, just under the organisation (go to the organisation Github page, there should be a New button on the right under repositories). Probably just call it indo-pacific-pacemakers-sst, and then you can add the scenario designations via the branches. Once you’ve done that, follow those steps I specified with the URL of the new empty repo for the other directories (so 4 times in this case).

oh no sorry I’d already closed the terminal when I saw this.

but just to confirm a couple of things:

  • should I delete the repo I created already before making one of the same name in the ACCESS-Community-Hub (and is there a ‘best way’ to do this on gadi?)
  • so I should make this new repo manually on the Github website rather than following the payu steps? and then (still outside gadi/payu, still on the github website) make one branch for each of the 4 experiments? To which I will then push the experiment setups following your instructions?

Sorry - just want to be totally clear I’m doing it right before I go ahead (:

No worries. Yea, should probably delete the old repository (scroll to the bottom of Settings to find that). I think manually making the repository and then pushing the 4 repos as branches will be easiest. Don’t worry about making mistakes, it’s easy enough to remove or clean up new repositories. I can’t see how anything you’d do would negatively affect others, since it’s all going to a new repository. Even when you’re working with other existing repositories, it’s usually pretty easy to fix mistakes- one of the great things about Git.

ok thanks I’ll give it a go. Will it be a problem that I’ve already initialised a git repo in one of my model control folders?

I suspect each of the folders will already be git repositories if they were cloned via payu? You can check by doing a git status
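A quick way to run that check, sketched on a scratch directory rather than a real control directory (the path is made up for illustration). git rev-parse --is-inside-work-tree succeeds only inside a Git working tree, so it can be used to decide whether git init is still needed.

```shell
demo=$(mktemp -d)

# Stand-in for a control directory that was set up via payu.
git init -q "$demo/control"
cd "$demo/control"

if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    is_repo=yes
    # Show the current branch and any uncommitted changes.
    git status --short --branch
    # Show which remotes (if any) are already linked.
    git remote -v
else
    is_repo=no
fi
echo "git repository: $is_repo"
```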

Hi @georgyfalster,

Sorry I started writing a reply last week and then had a meeting. Just to comment that if the plan is to do this:

Yea having one repo for each variability would work, but not ideal I think- we’d end up with many repos pretty quickly. I’ll do some playing around to see if there’s an easy way to turn each of those control directories into branches of one repository.

We set up something similar where @spencerwong wrote these handy instructions:

Very similar I think to what @lachlanswhyborn has suggested but I’m aware sometimes it’s handy to see a similar set of instructions with different words.

thanks - this should help because I did indeed fall at the expected hurdle when following @lachlanswhyborn 's directions above - eg running git remote add origin https://github.com/ACCESS-Community-Hub/access-esm1.5-sst-perturbation-indo-pacific.git from one of my model control folders gave me the error remote origin already exists.

But instead should I follow the instructions here? GitHub - ACCESS-Community-Hub/access-esm1.6-dev-experiments: Archive for tracking ACCESS-ESM1.6 development experiments. Refer to the readme for instructions on pushing experiments to this repository ? In which case, the only bit I am not sure about is what exactly is “esm1p6_dev_archive” in git remote add esm1p6_dev_archive git@github.com:ACCESS-Community-Hub/access-esm1.6-dev-experiments? The rest of the steps I think I can figure out. again, sorry to be so slow with this - I always seem to run into the same sorts of errors when trying to use git (eg the one above!!) which is what has always put me off learning to use it properly

You might be able to do git remote remove origin to unlink the original origin remote, and then point origin at a new one. Don’t worry, Git can often be quite finicky even for people that use it a lot, e.g. myself. This is something that I’d say is beyond basic Git usage- some people get to go through the usual process of learning Git from the basics, learning the structures and how things work under the hood.
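For reference, a small sketch of replacing an existing origin, again using bare repositories on local disk as stand-ins for the old and new GitHub repositories (the names old.git and new.git are made up). It also shows git remote set-url, which changes the URL in place and gives the same end result as removing and re-adding.

```shell
set -e
demo=$(mktemp -d)
git init -q --bare "$demo/old.git"
git init -q --bare "$demo/new.git"

# Control directory that is already linked to the old repository.
git init -q "$demo/control"
cd "$demo/control"
git remote add origin "$demo/old.git"

# Adding another remote under the same name is refused by Git.
if ! git remote add origin "$demo/new.git" 2>/dev/null; then
    echo "origin already exists, replacing it"
fi

# Option 1: remove the old link, then add the new one.
git remote remove origin
git remote add origin "$demo/new.git"

# Option 2 (same end result): keep the name and only change its URL.
git remote set-url origin "$demo/new.git"

git remote -v
```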

In those linked instructions, the esm1p6_dev_archive is a local name for the remote repository. It’s possible to link a local repository (e.g. your experiment directory) to multiple “remotes”- repositories hosted on Github. It will only become important if you make changes to your configuration and want to push them to Github (and if you have only one remote attached to your repository, I suspect you won’t have to worry about it, I’ve actually never needed multiple remotes for a local repository so I’m not 100% sure).

You should be able to follow the instructions at the link, replacing esm1p6_dev_archive with a name that is meaningful to you (and setting the right URL for your repo). Just be careful if you ever want to make changes- the origin remote will still be linked, and you’ll probably have to specify exactly where you want to push any changes to.
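To make the point about naming the push destination concrete, here is a sketch with two remotes attached to one control directory. Everything is local and illustrative: pacemaker_hub is a made-up remote name in place of esm1p6_dev_archive, sst-m60 is a hypothetical branch name, and the two bare repositories stand in for the original and the new GitHub repositories.

```shell
set -e
demo=$(mktemp -d)
git init -q --bare "$demo/original.git"   # stand-in for the existing origin
git init -q --bare "$demo/hub.git"        # stand-in for the new shared repo

git init -q "$demo/control"
cd "$demo/control"
git config user.name "Example User"
git config user.email "user@example.com"
echo "demo configuration" > config.yaml
git add config.yaml
git commit -q -m "Experiment configuration"
git branch -m sst-m60

# origin stays linked; the second remote gets its own meaningful name.
git remote add origin "$demo/original.git"
git remote add pacemaker_hub "$demo/hub.git"
git remote -v

# Name the remote explicitly so the branch goes only where intended.
git push -q pacemaker_hub sst-m60
```

After this, the branch exists in the second repository only; nothing is sent to origin unless a push names it.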

aaccchh sorry so that worked to remove the existing remote BUT git is still being finicky (I must say this process is not increasing my git enthusiasm aha).

Apparently now I don’t have permissions, but I think I’m logged in ok. Do I need to make the branch manually here first? GitHub - ACCESS-Community-Hub/access-esm1.5-sst-perturbation-indo-pacific: Magnitude of tropical Indian and Pacific ocean SST variability altered relative to ACCESS-ESM1.5 preindustrial control

Here’s a screenshot in case it’s something glaringly obvious - but i think I just followed the linked instructions.