Building the UM with my own gcom libraries

Hi all.

I’d like to build a UM with my own gcom libraries, as my ACCESS-AM3 test case is returning NaN from the gcom function gcg_r2darrsum.

@MartinDix has provided me with the location of the gcom sources and I have run the rose stem suite using

rose stem --group=all  -S MPI_VERSION=\'openmpi/4.1.7\'

after altering the suite STORAGE directive to point to my local data directory.

The suite has generated files in ~/cylc-run/vn7.9_nci_gadi/share/nci_gadi_ifort_mpp/build/lib

So now in the AM3 suite, I add the following to app/fcm_make_um/rose-app.conf

ldflags_overrides_prefix=-L/home/548/pag548/cylc-run/vn7.9_nci_gadi/share/nci_gadi_ifort_mpp/build/lib

Which creates the following entries in fcm-make-as-parsed.cfg

build-atmos.prop{fc.flags-ld} = -L/home/548/pag548/cylc-run/vn7.9_nci_gadi/share/nci_gadi_ifort_mpp/build/lib  -lgcom -qopenmp -lnetcdff -lnetcdf   -qopenmp   -mcmodel=medium -shared-intel   
build-recon.prop{fc.flags-ld} = -L/home/548/pag548/cylc-run/vn7.9_nci_gadi/share/nci_gadi_ifort_mpp/build/lib  -lgcom -qopenmp -lnetcdff -lnetcdf   -qopenmp   -mcmodel=medium -shared-intel  
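A quick sanity check at this point is to confirm that every -L directory in those link lines really contains libgcom.a. The following is only a sketch: the function name check_gcom_link_dirs is made up for illustration, and the path to fcm-make-as-parsed.cfg is passed in as an argument because its location depends on the suite.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of FCM or Rose): list each -L directory
# named on the fc.flags-ld lines of an fcm-make-as-parsed.cfg file and
# report whether it holds libgcom.a.
check_gcom_link_dirs() {
    local cfg="$1"
    grep -F 'prop{fc.flags-ld}' "$cfg" \
        | grep -oE -- '-L[^[:space:]]+' \
        | sed 's/^-L//' \
        | sort -u \
        | while read -r dir; do
              if [ -f "$dir/libgcom.a" ]; then
                  echo "found   $dir/libgcom.a"
              else
                  echo "missing $dir/libgcom.a"
              fi
          done
}

# Example call (replace the argument with the real location of the file):
# check_gcom_link_dirs /path/to/fcm-make-as-parsed.cfg
```

This only checks the link step. It says nothing about whether the compiler can find the module files at compile time.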

However, compilation fails due to errors and conflicts with the gcom libraries, and being unable to load the mpl modules, i.e.

	[FAIL] /home/548/pag548/cylc-run/am3-n640/share/fcm_make_um/preprocess-recon/src/um/src/control/misc/um_abort_mod.F90(54): error #7002: Error in opening the compiled module file.  Check INCLUDE paths.   [GC__BUILDCONST]
	[FAIL] USE gc__buildconst, ONLY: gc__forterrunit
	[FAIL] ----^
	[FAIL] /home/548/pag548/cylc-run/am3-n640/share/fcm_make_um/preprocess-recon/src/um/src/control/misc/um_abort_mod.F90(54): error #6580: Name in only-list does not exist or is not accessible.   [GC__FORTERRUNIT]
	[FAIL] USE gc__buildconst, ONLY: gc__forterrunit
	[FAIL] --------------------------^
	[FAIL] /home/548/pag548/cylc-run/am3-n640/share/fcm_make_um/preprocess-recon/src/um/src/control/misc/um_abort_mod.F90(96): error #6406: Conflicting attributes or multiple declaration of name.   [GC__FORTERRUNIT]
	[FAIL]   WRITE(gc__forterrunit, "(A)")                                                &
	[FAIL] --------^
	[FAIL] /home/548/pag548/cylc-run/am3-n640/share/fcm_make_um/preprocess-recon/src/um/src/control/misc/um_abort_mod.F90(96): warning #6187: Fortran 2008 requires an INTEGER data type in this context.   [GC__FORTERRUNIT]
	[FAIL]   WRITE(gc__forterrunit, "(A)")                                                &
	[FAIL] --------^
	[FAIL] compilation aborted for /home/548/pag548/cylc-run/am3-n640/share/fcm_make_um/preprocess-recon/src/um/src/control/misc/um_abort_mod.F90 (code 1)
	[FAIL] compile    0.3 ! um_abort_mod.o       <- um/src/control/misc/um_abort_mod.F90
	[FAIL] mpif90 -oo/setup_namelist.o -c -I./include -i8 -r8 -mcmodel=medium -std08 -g -traceback -assume nosource_include -O0 -fp-model precise -traceback -qopenmp /home/548/pag548/cylc-run/am3-n640/share/fcm_make_um/preprocess-recon/src/um/src/control/top_level/setup_nml_type.F90 # rc=1
	[FAIL] /home/548/pag548/cylc-run/am3-n640/share/fcm_make_um/preprocess-recon/src/um/src/control/top_level/setup_nml_type.F90(69): error #7002: Error in opening the compiled module file.  Check INCLUDE paths.   [MPL]
	[FAIL] USE mpl, ONLY: mpl_integer, mpl_real, mpl_address_kind, mpl_logical,           &
	[FAIL] ----^
	[FAIL] /home/548/pag548/cylc-run/am3-n640/share/fcm_make_um/preprocess-recon/src/um/src/control/top_level/setup_nml_type.F90(86): error #6683: A kind type parameter must be a compile-time constant.   [MPL_ADDRESS_KIND]
	[FAIL] INTEGER (KIND=mpl_address_kind) :: offsets(0:no_of_types-1)
	[FAIL] --------------^
	[FAIL] /home/548/pag548/cylc-run/am3-n640/share/fcm_make_um/preprocess-recon/src/um/src/control/top_level/setup_nml_type.F90(87): error #6683: A kind type parameter must be a compile-time constant.   [MPL_ADDRESS_KIND]
	[FAIL] INTEGER (KIND=mpl_address_kind) :: extent,extent2, extent3
	[FAIL] --------------^
	[FAIL] /home/548/pag548/cylc-run/am3-n640/share/fcm_make_um/preprocess-recon/src/um/src/control/top_level/setup_nml_type.F90(88): error #6683: A kind type parameter must be a compile-time constant.   [MPL_ADDRESS_KIND]
	[FAIL] INTEGER (KIND=mpl_address_kind) :: lb
	[FAIL] --------------^
	[FAIL] /home/548/pag548/cylc-run/am3-n640/share/fcm_make_um/preprocess-recon/src/um/src/control/top_level/setup_nml_type.F90(114): error #6404: This name does not have a type, and must have an explicit type.   [MPL_INTEGER]
	[FAIL]   oldtypes   (counter) = mpl_integer
	[FAIL] -------------------------^
	[FAIL] /home/548/pag548/cylc-run/am3-n640/share/fcm_make_um/preprocess-recon/src/um/src/control/top_level/setup_nml_type.F90(125): error #6404: This name does not have a type, and must have an explicit type.   [MPL_REAL]
	[FAIL]   oldtypes   (counter) = mpl_real

Some thoughts.

  1. Once I specify -L/home/548/pag548/cylc-run/vn7.9_nci_gadi/share/nci_gadi_ifort_mpp/build/lib should I then remove -lgcom? I have not loaded the existing gcom libraries during the fcm_make_um task, i.e. I’ve commented out module load gcom/7.9_ompi.4.1.7 from the UMBUILD_RESOURCE environment in site/nci_gadi.rc.

  2. There is a library file located in ~/cylc-run/vn7.9_nci_gadi/share/nci_gadi_ifort_mpp/build/lib/mpl. Should I explicitly add this path to -L for the fcm_make build process?

No, -lgcom is what tells the compiler to link in the gcom libgcom.a library.

You’ll only need to add the path containing libgcom.a to the -L flag.

Your issue is the compiler not finding the module files, not the library files. Add a -I flag to the fcflags for each path containing gcom *.mod files, e.g. -I$HOME/cylc-run/vn7.9_nci_gadi/share/nci_gadi_ifort_mpp/build/include. Gcom may put the module files in subdirectories; if so, use those instead. I can’t remember precisely.
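If it is unclear which directories hold the module files, something like the following could list them. It is a rough sketch: mod_include_flags is a made-up helper name, and it assumes the compiled modules have a lower-case .mod extension.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one -I flag for every directory under the
# given build tree that contains at least one *.mod file.
mod_include_flags() {
    local build_dir="$1"
    find "$build_dir" -type f -name '*.mod' -exec dirname {} \; \
        | sort -u \
        | sed 's/^/-I/' \
        | paste -sd' ' -
}

# Example call, using the build directory mentioned in this thread:
# mod_include_flags "$HOME/cylc-run/vn7.9_nci_gadi/share/nci_gadi_ifort_mpp/build"
```

The printed flags can then be copied into whichever fcflags setting the suite uses.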

Thanks Scott.

I’ve now got them linked inside my own UM executable. I added the following to app/fcm_make_um/rose-app.conf

fcflags_overrides=-I/home/548/pag548/cylc-run/vn7.9_nci_gadi/share/nci_gadi_ifort_mpp/build/include

and fcm_make_um did the rest.

There doesn’t seem to be a more elegant way to incorporate this statement. FCM itself has entries for fc.include-paths in fcm-make.cfg, e.g. from https://metomi.github.io/fcm/doc/user_guide/make.html

build.prop{fc.include-paths} = /a/path/to/include /more/path/to/include

But I’m not sure how this app is configured to allow this. When I tried editing app/fcm_make_um/file/fcm-make.cfg the suite would overwrite any changes.

Yeah working out where things come from in fcm is a headache, there’s a whole nest of different config files it reads from - the file paths being used do get printed at the top of the fcm log if you care to dig through them.

Unfortunately we haven’t yet released a spack AM3 build, but when that is done it should make this process a lot more seamless.

In any case, this build (with debug flags) fails in a simple GCOM array sum. When the solver eg_bicgstab runs for the first timestep, it calls gcg_r2darrsum, which uses mpl_allreduce. This last subroutine generates a NaN in the first index of the output array sum, which subsequently causes an immediate exit before the first timestep is finished. Other UM builds with this case integrate for a month or so.

I’ll ask Martin for the AM3 N96 test case, and I’ll see if my build produces the same error in the test suite.