Wall Time ignored - I think I have a fix

Many versions of the ACCESS system using rose/cylc on Gadi have an issue with the walltime. The Wallclock time field in the rose interface expects an ISO 8601 duration and will fail if it doesn’t receive one.

On the other hand, Gadi’s PBS system doesn’t understand this format. This has led to a workaround where the walltime is hardcoded in site/nci_gadi.rc and the Wallclock time setting is ignored, which is confusing.

I have tested adding this macro to the top of the site/nci_gadi.rc file:

{% macro evalclock(iso8601duration) -%}
     {% set hoursminssecs = iso8601duration.split("T")[1] -%}
     {% set hours = hoursminssecs.split("H")[0] | int -%}
     {% set mins = hoursminssecs.split("H")[1].split("M")[0] | int -%}
     {% set secs = hoursminssecs.split("H")[1].split("M")[1] | int -%}
     {{ [hours, mins, secs] | join(":") }}
{%- endmacro -%}

Then, in the section [[ATMOS_RESOURCE]][[[ directives ]]] I’ve added this line:

-l walltime = {{ evalclock(MAIN_CLOCK) }}

Note that it ignores any component of days or larger, but I haven’t seen any job requesting more than one day.

What do you think? Is there a better way? I’m a total newbie with jinja2.

Holger

Seems like a great idea @holger. I wonder if this (and other macros?) could be put in a macro library in a public repo, with some unit testing in CI using something like cfpb/macropolo on GitHub.

That would help with testing new functionality too.

I just found a bug in this. If you don’t specify minutes, for example PT2H, then it breaks. Back to the drawing board.
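
For anyone following along, here is a rough Python analogue of the split chain in the macro above. It is only an illustration: `split_after_hours` is a hypothetical helper, and Jinja2 reports the failure differently from Python, but I believe the root cause is the same. With no "M" in the string, the split yields a single element, so the `[1]` lookup used for the seconds has nothing to return.

```python
def split_after_hours(iso8601duration):
    """Return the pieces the first macro indexes into (hypothetical helper)."""
    # Same steps as the macro: drop everything up to "T", then up to "H".
    hoursminssecs = iso8601duration.split("T")[1]
    after_hours = hoursminssecs.split("H")[1]
    # The macro takes element [0] as minutes and element [1] as seconds.
    return after_hours.split("M")


# Full form: two elements, so both lookups succeed.
print(split_after_hours("PT2H30M15S"))
# Hours only: a single element, so the [1] lookup has no target.
print(split_after_hours("PT2H"))
```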

My new version is more flexible, but very much longer. There must be a better way, but I can’t find much documentation on jinja2, particularly on how to slice and dice strings. Anyway, here’s my new version, for anyone who is interested. (Mind you, you still have to add the -l walltime = {{ evalclock(MAIN_CLOCK) }} line in the relevant section.)

{% macro evalclock(iso8601duration) -%}
    {% set duration = iso8601duration.split("P")[1] -%}
    {% set totalseconds = 0 -%}
    {% if "T" in duration -%}
        {% set days_and_more, hours_and_less = duration.split("T") -%}
        {% if "H" in hours_and_less -%}
            {% set hours, minutes_and_less = hours_and_less.split("H") -%}
            {% set totalseconds = (hours | int) * 60 * 60 -%}
        {% else -%}
            {% set minutes_and_less = hours_and_less -%}
        {% endif -%}
        {% if "M" in minutes_and_less -%}
            {% set minutes, seconds = minutes_and_less.split("M") -%}
            {% set totalseconds = totalseconds + (( minutes | int ) * 60 ) -%}
        {% else -%}
            {% set seconds = minutes_and_less -%}
        {% endif -%}
        {% if "S" in seconds -%}
            {% set totalseconds = totalseconds + ( seconds.split("S")[0] | int ) -%}
        {% endif -%}
    {% else -%}
        {% set days_and_more = duration -%}
    {% endif -%}
    {% if "W" in days_and_more -%}
        {% set totalseconds = (days_and_more.split("W")[0] | int) * 7 * 24 * 60 * 60 -%}
    {% else -%}
        {% if "Y" in days_and_more -%}
            {% set years, months_and_less = days_and_more.split("Y") -%}
            {% set totalseconds = totalseconds + ((years | int) * 365 * 24 * 60 * 60 ) -%}
        {% else -%}
            {% set months_and_less = days_and_more -%}
        {% endif -%}
        {% if "M" in months_and_less -%}
            {% set monts, days = months_and_less.split("M") -%}
            {% set totalseconds = totalseconds + (( months | int ) * 30 * 24 * 60 * 60 ) -%}
        {% else -%}
            {% set days = months_and_less -%}
        {% endif -%}
        {% if "D" in days -%}
            {% set totalseconds = totalseconds + (( days.split("D")[0] | int ) * 24 * 60 * 60 ) -%}
        {% endif -%}
    {% endif -%}
    {{ totalseconds }}
{%- endmacro -%}
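
One way to cross-check the macro's output is a small reference implementation outside jinja2. The following is a minimal Python sketch, not part of the suite: `duration_to_seconds` is a hypothetical name, it handles integer fields only, and it makes the same approximations as the macro (365-day years, 30-day months).

```python
import re

# Either the weeks form (PnW) or the general form PnYnMnDTnHnMnS.
_DURATION = re.compile(
    r"^P(?:(?P<weeks>\d+)W|"
    r"(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?)$"
)

_SECONDS_PER = {
    "weeks": 7 * 86400,
    "years": 365 * 86400,  # approximation, as in the macro
    "months": 30 * 86400,  # approximation, as in the macro
    "days": 86400,
    "hours": 3600,
    "minutes": 60,
    "seconds": 1,
}


def duration_to_seconds(iso8601duration):
    """Convert an ISO 8601 duration string to a whole number of seconds."""
    match = _DURATION.match(iso8601duration)
    fields = {}
    if match is not None:
        # Keep only the components that were actually present.
        fields = {k: v for k, v in match.groupdict().items() if v is not None}
    if not fields:
        raise ValueError(f"unsupported ISO 8601 duration: {iso8601duration!r}")
    return sum(int(value) * _SECONDS_PER[name] for name, value in fields.items())


print(duration_to_seconds("PT2H"))    # 2 hours
print(duration_to_seconds("P1DT1H"))  # 1 day and 1 hour
```

Feeding the same strings to the macro and to this function should give matching totals.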

Just replying to highlight a typo: it should be "set months", not "set monts".

If you give an ISO 8601 duration to execution time limit, then it will automatically get passed to -l walltime as a duration in seconds.
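
When comparing a walltime in seconds against an H:MM:SS value in an existing directive, the conversion is plain arithmetic. A minimal Python sketch, where `seconds_to_hms` is a hypothetical helper and not anything cylc or PBS provides:

```python
def seconds_to_hms(total_seconds):
    """Format a whole number of seconds as H:MM:SS."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    # Zero-pad minutes and seconds; leave hours unpadded.
    return f"{hours}:{minutes:02d}:{seconds:02d}"


print(seconds_to_hms(3600))  # one hour
print(seconds_to_hms(9015))  # 2 hours, 30 minutes, 15 seconds
```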

@Scott It was more complex than that. That was the first thing I tried. But after your comment, I went back and tried one more thing:

I had seen that the section [[ HPC ]][[[ directives ]]] contained a setting

-l walltime = 1:00:00

and when I commented out that setting, the execution time limit was correctly set. No question that is much easier than my convoluted jinja2 macro.

Thank you for making me check twice.

Cheers
Holger