Branching similar experiments in payu git

Hi again @Aidan,

A request for help, please, with git branching in payu.

I created a github repository from one of my Miocene run directories. I currently have 3xCO2, 2xCO2 and 1xCO2. These runs are located at:

/home/157/dkh157/ACCESS/mio_v4
/home/157/dkh157/ACCESS/mio_c2
/home/157/dkh157/ACCESS/mio_c1

I have pushed the 3xCO2 folder (which had git settings from payu) to a public repository, where it’s currently sitting as the “master” branch.

Ideally, I want the 2x and 1x CO2 runs to be branches of the same repository. But they’ve already been run for a few hundred years. So, if I branch off my “master”, which is the mio_v4 folder, then that would copy the contents of mio_v4 to a new folder, right? Should I then copy across the contents of mio_c2 and mio_c1 to some newly copied branched folder? Or is that becoming a mess?

Regards,
David

I can’t access the .git subfolders because they’re owned by project y99, of which I am not a member:

drwxr-x--- 8 y99 4096 Mar 31 14:13 mio_c1/.git
drwxr-x--- 8 y99 4096 Mar 31 12:25 mio_c2/.git
drwxr-x--- 8 y99 4096 Oct  6  2023 mio_v1/.git
drwxr-x--- 8 y99 8192 Mar  8 16:18 mio_v3/.git
drwxr-x--- 8 y99 4096 May  9 21:25 mio_v4/.git

If you did something like

chmod -R a+rX /home/157/dkh157/ACCESS

it would mean anyone with access to your home directory could read and copy your experiments. If this is not desirable you could use ACLs to give just me access, or copy them elsewhere.

That’s fine, I ran the command to give everyone read access.

So what you want to do now might be different from what you would have done from the very beginning, but we can start with getting all the control directories in the same repo.

Add other experiments as remotes

Add your 1xCO2 and 2xCO2 repos as remotes to your mio_v4 repo:

git remote add mio_c1 /home/157/dkh157/ACCESS/mio_c1
git remote add mio_c2 /home/157/dkh157/ACCESS/mio_c2

These are remotes in the same way that GitHub is a remote. They’re independent repositories that you can choose to interact with through commands like fetch, push and pull. Note that the names I gave the remotes are arbitrary; I just named them after the directories where the repos reside.

You can see which remotes are defined using git remote -v:

$ git remote -v
mio_c1  /home/157/dkh157/ACCESS/mio_c1 (fetch)
mio_c1  /home/157/dkh157/ACCESS/mio_c1 (push)
mio_c2  /home/157/dkh157/ACCESS/mio_c2 (fetch)
mio_c2  /home/157/dkh157/ACCESS/mio_c2 (push)
origin  git@github.com:dkhutch/access_esm_mio_run.git (fetch)
origin  git@github.com:dkhutch/access_esm_mio_run.git (push)

(In my case origin is because I cloned your repo from that GitHub repo)

Fetch branch information from remotes

Fetch branch and other data from your new remotes:

git fetch mio_c1
git fetch mio_c2

Now you can see what branches are available on your remotes:

$ git branch -a
* master
  remotes/mio_c1/master
  remotes/mio_c2/master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
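
To rehearse these two steps without touching the real control directories, here is a minimal sketch using throwaway repositories in a temporary directory. The `demo` identity and the `config.yaml` contents are placeholders for illustration and are not taken from the actual experiments.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a scratch directory so nothing real is modified.
sandbox=$(mktemp -d)

# Three unrelated repositories standing in for the control directories.
for exp in mio_v4 mio_c1 mio_c2; do
    git -c init.defaultBranch=master init -q "$sandbox/$exp"
    echo "experiment: $exp" > "$sandbox/$exp/config.yaml"
    git -C "$sandbox/$exp" add config.yaml
    git -C "$sandbox/$exp" -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "Placeholder config for $exp"
done

cd "$sandbox/mio_v4"

# Register the other two directories as remotes and fetch their history.
for exp in mio_c1 mio_c2; do
    git remote add "$exp" "$sandbox/$exp"
    git fetch -q "$exp"
done

# Fetching only creates remote-tracking branches; no local branches are added yet.
git branch -a
```

The listing should show the local master plus a remote-tracking master for each remote, with no origin entries because this sandbox was not cloned from anywhere.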

Make local copies of remote branches

Check out the master branches of the remotes into separate local branches within your repo:

git checkout mio_c1/master -b mio_c1_master
git checkout mio_c2/master -b mio_c2_master

Again, the branch names are arbitrary.

You could also make a named branch for your v4 experiment:

git checkout master -b mio_v4_master

So now you have a repository that contains named branches for all your experiments:

$ git branch -a
  master
  mio_c1_master
  mio_c2_master
* mio_v4_master
  remotes/mio_c1/master
  remotes/mio_c2/master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
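
One thing a shared repo makes easy is comparing experiment configurations across branches. A minimal sketch, again on throwaway repositories: the `make_repo` helper, the `co2_factor` key and the `config.yaml` contents are invented for illustration and are not the real ACCESS-ESM settings.

```shell
#!/usr/bin/env bash
set -euo pipefail

sandbox=$(mktemp -d)

# make_repo NAME FACTOR: create a stand-in control directory (hypothetical contents).
make_repo() {
    git -c init.defaultBranch=master init -q "$sandbox/$1"
    printf 'experiment: %s\nco2_factor: %s\n' "$1" "$2" > "$sandbox/$1/config.yaml"
    git -C "$sandbox/$1" add config.yaml
    git -C "$sandbox/$1" -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "Placeholder config for $1"
}
make_repo mio_v4 3
make_repo mio_c2 2

cd "$sandbox/mio_v4"
git remote add mio_c2 "$sandbox/mio_c2"
git fetch -q mio_c2

# New branch name first, then the starting point.
git checkout -q -b mio_c2_master mio_c2/master
git checkout -q -b mio_v4_master master

# Show how the 2xCO2 config differs from the 3xCO2 one.
# Without --exit-code, git diff exits 0 even when differences exist.
git diff mio_v4_master mio_c2_master -- config.yaml
```

In the diff, lines prefixed with `-` come from the first branch named and lines prefixed with `+` from the second.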

Push new branches to GitHub

I forked your repo to a remote called me and pushed all the branches so you can see how it works:

$ git push me --all
Enumerating objects: 1992, done.
Counting objects: 100% (1992/1992), done.
Delta compression using up to 48 threads
Compressing objects: 100% (581/581), done.
Writing objects: 100% (1992/1992), 1.08 MiB | 5.00 MiB/s, done.
Total 1992 (delta 1448), reused 1949 (delta 1405), pack-reused 0 (from 0)
remote: Resolving deltas: 100% (1448/1448), done.
To github.com:aidanheerdegen/access_esm_mio_run.git
 * [new branch]      mio_c1_master -> mio_c1_master
 * [new branch]      mio_c2_master -> mio_c2_master
 * [new branch]      mio_v4_master -> mio_v4_master

Repo is here:

https://github.com/aidanheerdegen/access_esm_mio_run/branches
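
To publish the same branches to your own repository rather than a fork, one option is to push them to origin by name. A minimal sketch, using a local bare repository as a stand-in for GitHub and a single placeholder commit. The `-u` flag records the pushed branch as the upstream of each local branch, which replaces any upstream set when the branch was created from a local remote.

```shell
#!/usr/bin/env bash
set -euo pipefail

sandbox=$(mktemp -d)

# A bare repository standing in for the GitHub remote (assumption for the demo).
git init -q --bare "$sandbox/origin.git"

# A working repository with one placeholder commit and three experiment branches.
git -c init.defaultBranch=master init -q "$sandbox/mio_v4"
cd "$sandbox/mio_v4"
echo "placeholder" > config.yaml
git add config.yaml
git -c user.name=demo -c user.email=demo@example.com commit -q -m "Placeholder commit"
git branch mio_c1_master
git branch mio_c2_master
git branch mio_v4_master

git remote add origin "$sandbox/origin.git"

# Push only the named branches and set origin as their upstream.
git push -q -u origin mio_c1_master mio_c2_master mio_v4_master

# List the branches now present on the stand-in remote.
git ls-remote --heads origin
```

Naming the branches explicitly, instead of using --all, leaves any local scratch branches unpublished.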

So now you have all your control directories in a single repo, but the outputs and restarts in the archive directory are still stored under directories named for the control directories, e.g. /scratch/.../access-esm/archive/mio_c2.

If you wanted to continue your runs from within a single control repo then you’d have to copy or move the outputs and restarts from their current directory into a new experiment directory, or use the restart option to point to the most recent restartXXX directory.
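
For the second option, here is a small sketch of locating the most recent restartXXX directory. The `archive` variable points at a throwaway directory populated with made-up restart numbers; on the real system it would be the experiment's archive directory, whose full path is not reproduced here. It assumes a `sort` that supports `-V` (version sort), as GNU sort does.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in archive directory with made-up restart numbers (demo only).
archive=$(mktemp -d)
mkdir "$archive/restart098" "$archive/restart099" "$archive/restart100" "$archive/output100"

# Version sort orders by the numeric suffix; keep the last (highest) entry.
latest_restart=$(find "$archive" -maxdepth 1 -type d -name 'restart*' | sort -V | tail -n 1)

echo "Most recent restart: $latest_restart"
```

The resulting path is the one the restart option mentioned above would point to.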

It is much easier if you’re doing this from the beginning with the new payu branching workflow. I’ll try and provide an example of how that would work next week.

Amazing, thanks so much for working through the process in detail. Really appreciate the thorough explanation of the steps.
