Hi Intake Team,
I have a couple of questions about intake datastores. I have run a regional OM3 model using the ACCESS-OM3 infrastructure, which automatically made intake_esm_ds.json files at the end of the experiment. I have two issues which I was hoping you may be able to help with:
- I ran the model in three different folders (slight parameter changes in each case, but same diagnostics and file structures). This means I have three datastores. Is it possible with the current infrastructure to merge the three datastores in the notebook where I am analysing them? Alternatively, should I remake a single datastore by putting them all in one folder and using the script at the end of "Building Intake-ESM datastores of ACCESS model output" in the COSIMA Cookbook documentation? Can this script pick up output in symbolic links?
- Some of the diagnostics didn’t get picked up by the esm datastore making script. I think this is because they were not named with the access-om3.mom6.XX.nc convention; instead they are called something like ocean_daily_z.nc. Do you have any suggestions that might allow these files to be added to the datastore?
Thank you!
Claire
Okay, this code should work to combine your datastores:
from intake_esm.core import esm_datastore
import pandas as pd


def combine_datastores(*esm_datastores) -> esm_datastore:
    """
    Takes a bunch of esm_datastores and returns a combined esm_datastore.
    Still very much a first draft but should work for consistently shaped
    datastores.

    Usage: combine_datastores(esm_datastore_1, esm_datastore_2, ... , esm_datastore_n)

    To get your datastores:
    ```python
    import intake
    esm_ds1 = intake.open_esm_datastore(
        "/path/to/datastore1.json",
        columns_with_iterables = ['variable'], # Probably
    )
    ```
    """
    # Reuse the catalogue description (the JSON side) of the first datastore
    esmcat_dict = esm_datastores[0].esmcat.dict()
    # Stack the per-datastore tables of files into a single table
    df_list = [ds.df for ds in esm_datastores]
    df = pd.concat(df_list)
    return esm_datastore({'esmcat': esmcat_dict, 'df': df})
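Once you have a combined datastore, you can save it by calling serialize on it (called combined_datastore here):
# esm_ds1, esm_ds2, esm_ds3 are the three datastores opened as described above
combined_datastore = combine_datastores(esm_ds1, esm_ds2, esm_ds3)
combined_datastore.serialize(
    name = "experiment_datastore", # saves to `experiment_datastore.json`
    directory = "/path/to/a/dir", # Puts it in directory `/path/to/a/dir`
    catalog_type='file'
)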
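Since the three runs share the same diagnostics and file structure, rows in the combined table may only differ by their path. If that makes the runs hard to tell apart, one option is to tag each table with a label before concatenating. The sketch below uses pandas only, with toy tables standing in for each datastore's `.df`; the function name `tag_and_concat`, the column name `experiment` and the paths are all made up for illustration, and whether intake-esm is happy with an extra column that the catalogue JSON does not describe is untested here.

```python
import pandas as pd


def tag_and_concat(frames: dict[str, pd.DataFrame], column: str = "experiment") -> pd.DataFrame:
    """Concatenate catalogue tables, recording which run each row came from.

    `frames` maps a label of your choosing to a datastore's `.df` table.
    `column` is a hypothetical column name, not something intake-esm requires.
    """
    # Add the label as a constant column on a copy of each table
    tagged = [df.assign(**{column: label}) for label, df in frames.items()]
    # ignore_index=True gives the combined table a fresh 0..n-1 index
    return pd.concat(tagged, ignore_index=True)


# Toy stand-ins for esm_ds1.df and esm_ds2.df (paths are made up)
run_a = pd.DataFrame({"path": ["/run_a/out.nc"], "variable": ["thetao"]})
run_b = pd.DataFrame({"path": ["/run_b/out.nc"], "variable": ["thetao"]})

combined_df = tag_and_concat({"run_a": run_a, "run_b": run_b})
print(combined_df[["path", "experiment"]])
```

In combine_datastores, the same idea would take the place of the plain pd.concat(df_list) call.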
Let me know if you have any issues & I’ll do my best to help!
Cheers, Charles
Awesome, thanks so much @CharlesTurner, that script combined the datastores!
On the files that don’t make it in: I’ve moved all the output files to /g/data/ol01/cy8964/access-om3/archive/8km_jra_ryf_obc_Charrassin/, and the ones that don’t make it in are ocean_month_z.nc and ocean_month.nc.
Thanks!
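For anyone wanting to check the naming hypothesis on their own output, here is a minimal sketch that sorts NetCDF file names into those that match the convention and those that do not. It uses only the standard library. The glob `access-om3.mom6.*.nc` is an assumption read off the convention mentioned in this thread, the helper names are made up, and the code does not reproduce whatever matching the datastore builder actually does.

```python
from fnmatch import fnmatch
from pathlib import Path

# Assumed pattern, based on the naming convention mentioned in this thread
EXPECTED_PATTERN = "access-om3.mom6.*.nc"


def split_by_convention(filenames, pattern=EXPECTED_PATTERN):
    """Return (matching, non_matching) lists of NetCDF file names."""
    matching, non_matching = [], []
    for name in filenames:
        if not name.endswith(".nc"):
            continue  # ignore anything that is not NetCDF output
        (matching if fnmatch(name, pattern) else non_matching).append(name)
    return matching, non_matching


def check_directory(directory, pattern=EXPECTED_PATTERN):
    """Apply split_by_convention to every .nc file found under `directory`."""
    names = sorted(p.name for p in Path(directory).rglob("*.nc"))
    return split_by_convention(names, pattern)


# File names from this thread, plus one made-up name that fits the pattern
ok, skipped = split_by_convention(
    ["access-om3.mom6.example.nc", "ocean_month_z.nc", "ocean_month.nc", "notes.txt"]
)
print("match the convention:", ok)
print("do not match:", skipped)
```

Calling check_directory on an archive folder gives the same two lists for real output, which makes it easy to see which files to compare against the list of files that ended up in the datastore.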