Development

This section details the steps required to develop a component contribution that is compliant with the framework.

You can follow the steps below to develop your component(s):

Create your science component by subclassing a generic framework component

In the modelling framework, the terrestrial water cycle is divided into three components: SurfaceLayerComponent, SubSurfaceComponent, and OpenWaterComponent (see Fig. 1). Your science component must be created as a subclass of one of these three framework components.

Each component features a fixed interface, i.e. a pre-defined set of transfers of information with the other components of the framework. It consists of inward information (variables that are given to the component, i.e. “inwards”) and outward information (variables that are computed by the component, i.e. “outwards”); see Fig. 2 and Tab. 1.


Fig. 2: Transfers of Information between Components (see Tab. 1 for the meaning of the numbers).

Tab. 1: Transfers of Information between Components (see Fig. 2 for the context of the numbers)

==  =============================================  ==========
#   Name                                           Unit
==  =============================================  ==========
1   canopy_liquid_throughfall_and_snow_melt_flux   kg m-2 s-1
2   transpiration_flux_from_root_uptake            kg m-2 s-1
3   soil_water_stress_for_transpiration            1
4   direct_water_evaporation_flux_from_soil        kg m-2 s-1
5   soil_water_stress_for_direct_soil_evaporation  1
6   water_evaporation_flux_from_standing_water     kg m-2 s-1
7   standing_water_area_fraction                   1
8   total_water_area_fraction                      1
9   water_evaporation_flux_from_open_water         kg m-2 s-1
10  direct_throughfall_flux                        kg m-2 s-1
11  surface_runoff_flux_delivered_to_rivers        kg m-2 s-1
12  net_groundwater_flux_to_rivers                 kg m-2 s-1
13  open_water_area_fraction                       1
14  open_water_surface_height                      m
==  =============================================  ==========

For component contributions to be fully unifhy-compliant, they need to comply with this fixed interface. If your science component contribution overlaps several framework components, it needs to be refactored into the corresponding number of components.

Contributions must be implemented as Python classes, and more specifically as subclasses of one of the three framework components. This way, the interface for the science component contribution is already set, and the component directly inherits all the functionalities intrinsic to the framework so that, as a contributor, you can focus solely on specifying the data and science elements of your component(s).

Creating your contribution as a Python class can simply be done by subclassing from the relevant framework component.

Example

See an example of a mock surface layer component creation below.

Subclassing from SurfaceLayerComponent class.
import unifhy


class SurfaceLayerComponent(unifhy.component.SurfaceLayerComponent):
    pass

Note

pass is only added here temporarily so that this Python example script remains valid; it will be replaced in the subsequent steps.

Note

By convention, please use the same class name as the one you subclassed from, e.g. SurfaceLayerComponent here.

Document your science component using its class docstring

A description of the component (with reference(s) if applicable) must be provided, alongside a field list containing e.g. name(s) of contributor(s), affiliation(s) of contributor(s), licence, copyright, etc. To do so, use your class docstring and follow reStructuredText syntax.

For the structure of the docstring itself, please start with a short summary line followed by a blank line before providing a more elaborate description as per PEP 257. The field list should be placed last and preceded by a blank line.

Example

See an example of a mock component description below.

Using the component class docstring for description and acknowledgment.
import unifhy


class SurfaceLayerComponent(unifhy.component.SurfaceLayerComponent):
    """Summary line describing the component.

    More elaborate description for the component.

    References:
    `Doe et al. (2020) <https://doi.org/##.####/XXX>`_.

    :Contributors: Jane Doe [1]
    :Affiliations: [1] University
    :Licence: GPL-3.0
    :Copyright: 2021, Jane Doe
    :Codebase: https://github.com/XXX/XXX
    """

Define your science component using its class attributes

The component interface definition is used by the framework to make sure that your component can be coupled with other components to form a model. While a standard interface exists for each framework component (see Fig. 2), a science component may not use or produce all of its transfers. As long as the components it is coupled with do not need the transfers it does not produce, the framework does not enforce full compliance with the standard interface. However, transfers not present in the standard interface cannot be used.

The definition of the component interface is specified by assigning sets to the class attributes _inwards and _outwards. In such a set, the items must be part or all of the transfers in the fixed interface for this component (see Fig. 2).

The component definition is used by the framework to make sure that all the information required by the component to run is provided by the user. The definition of a component is made of the information about its inputs, outputs, states, parameters, and constants.

The definition for the component is specified by assigning dictionaries to the class attributes _inputs_info, _outputs_info, _states_info, _parameters_info, and _constants_info. In such a dictionary, the keys correspond to the variable names, and the corresponding value is another dictionary containing the metadata for the variable. The metadata must at least feature the variable units in the SI system; a brief description of the variable in plain English is also encouraged.

See an example below:

{
    'variable_name': {
        'units': 'SI unit',
        'description': 'plain english'
    }
}

Details on what each type of variable is, and their potential additional metadata are provided in the sub-sections below.

Inputs

The inputs correspond to the driving data required by the component. They exclude those variables already included in the fixed interface, i.e. inwards (see Fig. 2).

In addition to its units and description, each input variable has a kind, which can be one of the following three:

  • 'dynamic': data required for each spatial element and for each time step,

  • 'static': data required for each spatial element and constant over time,

  • 'climatologic': data required for each spatial element and for a given frequency within a climatology year.

If no kind is specified, a dynamic kind is assumed.

If the input is of climatologic kind, a frequency must also be given. It can take one of the supported values described in Tab. 2 below.

Tab. 2: Supported frequencies for climatologic inputs

======================  ======================================================
climatologic frequency  length of time dimension in data
======================  ======================================================
'seasonal'              Length of 4, corresponding to the meteorological
                        seasons (i.e. Winter [DJF], Spring [MAM], Summer
                        [JJA], Autumn [SON], in this order).
'monthly'               Length of 12, corresponding to the months in the
                        calendar year (i.e. from January to December).
'day_of_year'           Length of 366, corresponding to the days in the
                        calendar year (i.e. from January 1st to December
                        31st, including value for February 29th).
int                     Length according to the integer value (e.g. a value
                        of 6 means 6 climatologic values for the calendar
                        year).
======================  ======================================================
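The supported frequencies can be summarised as a small lookup from frequency to the expected length of the time dimension. The sketch below only restates the table; the function climatology_length is hypothetical and is not a framework function, and rejecting non-positive integers is an assumption.

```python
def climatology_length(frequency):
    """Return the expected length of the time dimension for a frequency."""
    named_lengths = {'seasonal': 4, 'monthly': 12, 'day_of_year': 366}
    if isinstance(frequency, str):
        if frequency not in named_lengths:
            raise ValueError(f'unsupported frequency: {frequency!r}')
        return named_lengths[frequency]
    # bool is a subclass of int in Python, so it is excluded explicitly
    if (isinstance(frequency, int) and not isinstance(frequency, bool)
            and frequency > 0):
        return frequency
    raise ValueError(f'unsupported frequency: {frequency!r}')


for frequency in ('seasonal', 'monthly', 'day_of_year', 6):
    print(frequency, climatology_length(frequency))
```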

The framework gives the inputs as keyword arguments to the component run method. They are given as arrays of the same shape as the component space domain.

Outputs

The outputs correspond to the variables computed by the component that you want the component users to be able to record. The outputs exclude those variables already included in the fixed interface, i.e. outwards (see Fig. 2). The component users will always be able to record the component outwards and the component states they would like; the component outputs let you add more variables to their list of recordable variables.

Any output returned must be a numpy.ndarray of the same shape as the component space domain. The output name used in the returned dictionary can differ from the Python variable pointing to the array.

States

The states correspond to the component variables that need to be given initial values to start the component time integration, and whose values need to be sustained from one time step to the next.

In addition to units and description, a state can be given an optional divisions metadata, whose expected value is an integer, a string, or a sequence of integers and/or strings:

  • by default its value is 1, indicating the state is a scalar;

  • if its value is an integer greater than 1, it indicates that the state is a vector, and its value is the length of the vector;

  • if its value is a string, the string must correspond to the name of a component constant, whose value will be used in place of the string as divisions;

  • if its value is a sequence, it indicates that the state is an array, and its values are the lengths of the dimensions of the array (in the order in the sequence).

The divisions metadata can be used when considering e.g. different vertical layers in a component. Note that scalar/vector/array refers to the dimension of the state for a given element in the space domain. A scalar state does not mean that there is only one state value for the whole space domain; it means that there is one state value for each spatial element in the space domain.

The framework gives the states as keyword arguments to the component initialise, run, and finalise methods. They are given as framework State objects.

Important

Each State object stores the different timesteps of a component state. To retrieve or assign values for a given timestep, the methods get_timestep and set_timestep must be used.

State.get_timestep(timeindex)

Return the state value(s) for the given time index (indices).

Parameters
timeindex: int (or slice)

The temporal index (or indices) of the state to evaluate. The indices must be lower than or equal to zero. Index 0 corresponds to the most recent timestep, index -1 corresponds to the second most recent timestep, index -2 corresponds to the third most recent timestep, etc.

Returns
(list of) numpy.ndarray.view

The state value (or list of values) for the requested time index (indices).

State.set_timestep(timeindex, value)

Assign the state value(s) at the given time index (indices).

Parameters
timeindex: int (or slice)

The temporal index (or indices) of the state to assign value(s) for. The indices must be lower than or equal to zero. Index 0 corresponds to the most recent timestep, index -1 corresponds to the second most recent timestep, index -2 corresponds to the third most recent timestep, etc.

value: numpy.ndarray

The state value(s) to assign at the given time index (indices).
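To make the time index convention concrete, the sketch below uses a toy stand-in for a State object. It is not the framework class: ToyState, its list-based storage, its support for integer indices only, and the way its increment method rotates the timesteps are all assumptions made for illustration.

```python
class ToyState:
    """Hypothetical stand-in storing timesteps, most recent last."""

    def __init__(self, n_timesteps=2, initial=0.0):
        self._slices = [initial] * n_timesteps

    def _position(self, timeindex):
        # valid time indices run from -(n_timesteps - 1) up to 0
        if timeindex > 0 or timeindex <= -len(self._slices):
            raise IndexError(f'invalid time index: {timeindex}')
        # index 0 maps to the last list item, -1 to the one before it, etc.
        return len(self._slices) - 1 + timeindex

    def get_timestep(self, timeindex):
        return self._slices[self._position(timeindex)]

    def set_timestep(self, timeindex, value):
        self._slices[self._position(timeindex)] = value

    def increment(self):
        # assumption: moving forward in time rotates the stored timesteps,
        # so the value set at index 0 becomes the value read at index -1
        self._slices = self._slices[1:] + self._slices[:1]


state = ToyState()
state.set_timestep(-1, 10.0)                         # initial condition
state.set_timestep(0, 0.5 * state.get_timestep(-1))  # one integration step
state.increment()                                    # move forward in time
print(state.get_timestep(-1))  # 5.0
```

The pattern mirrors a typical run step: read the previous value at index -1, then write the new value at index 0.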

The retrieved state arrays have the same shape as the component space domain, plus one or more additional trailing axes if the component state features divisions: a scalar state features no additional axis, a vector state features one additional axis of size equal to the vector length, and an array state features as many additional axes as the array has dimensions, with sizes equal to the lengths of these dimensions and in the same order.
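The relationship between the divisions metadata and the shape of a retrieved state array can be sketched as follows. The function state_shape is hypothetical and is not part of the framework; how the framework treats a sequence entry equal to 1 is not covered by the text, so this sketch simply keeps it as an axis of size 1.

```python
def state_shape(space_shape, divisions=1, constants=None):
    """Return the array shape of a state for a given space domain shape."""
    constants = constants or {}
    # a single integer or string is treated as a one-item sequence
    if isinstance(divisions, (int, str)):
        divisions = [divisions]
    trailing = []
    for division in divisions:
        if isinstance(division, str):
            # a string refers to a component constant holding the length
            division = constants[division]
        trailing.append(int(division))
    # a scalar state (divisions of 1) adds no trailing axis
    if trailing == [1]:
        trailing = []
    return tuple(space_shape) + tuple(trailing)


space = (3, 4)  # mock shape of a gridded space domain
print(state_shape(space))                                    # (3, 4)
print(state_shape(space, 4))                                 # (3, 4, 4)
print(state_shape(space, 'n_layers', {'n_layers': 5}))       # (3, 4, 5)
print(state_shape(space, (2, 'n_layers'), {'n_layers': 5}))  # (3, 4, 2, 5)
```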

Parameters

The parameters are those variables subject to tuning. Note, parameter tuning/calibration is not a functionality offered by the framework.

In addition to units and description, providing a valid_range metadata is recommended. Its value is a sequence of two numbers defining the extent of the valid range of parameter values. Providing such a range helps the users of your component to determine parameter values for their specific modelling context.

The framework gives the parameters as keyword arguments to the component initialise, run, and finalise methods. They are given as arrays of the same shape as the component space domain.
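A user of your component could rely on the valid_range metadata along the lines sketched below. The helper within_valid_range is hypothetical, it works on a single number rather than an array, and treating both bounds as inclusive is an assumption.

```python
def within_valid_range(value, metadata):
    """Tell whether a parameter value lies within its valid range, if any."""
    valid_range = metadata.get('valid_range')
    if valid_range is None:
        # valid_range is only recommended, so its absence is not an error
        return True
    lower, upper = valid_range
    return lower <= value <= upper  # assumption: bounds are inclusive


parameter_1_info = {
    'units': '1',
    'valid_range': [0, 1],
    'description': 'brief parameter description here',
}

print(within_valid_range(0.3, parameter_1_info))  # True
print(within_valid_range(1.5, parameter_1_info))  # False
```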

Constants

The constants are those variables not subject to tuning. Nevertheless, they can be adjusted by the users, e.g. to adjust their precision or to adapt them to their modelling context.

In addition to its units and description, a default_value metadata is mandatory for each constant. This provides a value for users who are not interested in providing/adjusting it.

The framework gives the constants as keyword arguments to the component initialise, run, and finalise methods. They are given as scalars.

Extra spatial attributes

In addition, the component definition features three optional spatial attributes _requires_land_sea_mask, _requires_flow_direction and _requires_cell_area. They must be assigned a boolean value (True if required by your component, False if not) – their default value is False. If they are required, the framework will ensure that the user provides the information so that the component can run successfully.

If you need land sea mask information for your computations, set _requires_land_sea_mask to True and access it in your class methods using self.spacedomain.land_sea_mask. This will return an array of the same size as the space domain (see e.g. LatLonGrid.land_sea_mask for details).

If you need flow direction information or want to use the flow routing functionality of the component (accessible through self.spacedomain.route(...)), set _requires_flow_direction to True, and access it in your class methods using self.spacedomain.flow_direction. This will return an array of the same size as the space domain plus an additional trailing axis of size 2 for gridded space domains (see e.g. LatLonGrid.flow_direction for details).

If you need the horizontal cell area of the space domain elements, set _requires_cell_area to True and access it in your class methods using self.spacedomain.cell_area. This will return an array of the same size as the space domain (see e.g. LatLonGrid.cell_area for details).

Example

See a detailed example of a mock component definition below.

Completing the component class definition in the class attributes.
import unifhy


class SurfaceLayerComponent(unifhy.component.SurfaceLayerComponent):
    """component description here"""
    _inwards = {
        'inwards_1',
        'inwards_2',
        'inwards_3'
    }
    _outwards = {
        'outwards_1'
    }
    _inputs_info = {
        'input_1': {
            'kind': 'dynamic',
            'units': 'kg m-2 s-1',
            'description': 'brief input description here'
        },
        'input_2': {
            'kind': 'climatologic',
            'frequency': 'monthly',
            'units': 'kg m-2 s-1',
            'description': 'brief input description here'
        },
        'input_3': {
            'kind': 'static',
            'units': 'm',
            'description': 'brief input description here'
        }
    }
    _outputs_info = {
        'output_1': {
            'units': 'kg m-2 s-1',
            'description': 'brief output description here'
        },
        'output_2': {
            'units': 'kg m-3 s-1',
            'description': 'brief output description here'
        }
    }
    _states_info = {
        'state_1': {
            'units': 'kg m-2',
            'description': 'brief state description here'
        },
        'state_2': {
            'divisions': 4,
            'units': 'kg m-2',
            'description': 'brief state description here'
        }
    }
    _parameters_info = {
        'parameter_1': {
            'units': '1',
            'valid_range': [0, 1],
            'description': 'brief parameter description here'
        }
    }
    _constants_info = {
        'constant_1': {
            'units': '1',
            'default_value': 0.5,
            'description': 'brief constant description here'
        }
    }
    _requires_land_sea_mask = False
    _requires_flow_direction = True
    _requires_cell_area = False

Implement the initialise-run-finalise component class methods

The numerical calculations in your component contribution must be broken down into the three phases initialise, run, and finalise. This means that your Python class must feature three methods named initialise, run, and finalise. Note, initialize and finalize spellings are not supported.

Since the parameters of the three methods initialise, run, and finalise are going to be passed as keyword arguments, the names of the parameters in the signatures of these methods must necessarily be the ones found in the component definition attributes (if renaming is required, this can be done internally to the methods). In turn, this means that the order of the method parameters in the method signatures does not matter. Moreover, your method signatures must all feature a final special method parameter **kwargs to collect all the remaining available arguments given by the framework that the component is not using.
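The effect of keyword arguments and of the final **kwargs can be seen with plain Python, independently of the framework. In the sketch below, MockComponent and the dictionary of available arguments are hypothetical and only mimic how the arguments reach a method.

```python
class MockComponent:
    """Hypothetical stand-in, not a framework component."""

    def run(self, input_1, parameter_1, **kwargs):
        # parameter names match the names in the component definition;
        # every argument the method does not use is collected in kwargs
        return input_1 * parameter_1, sorted(kwargs)


# The arguments are passed by name, so their order does not matter.
available = {
    'constant_1': 0.5,
    'parameter_1': 2.0,
    'input_1': 3.0,
    'state_1': None,
}

result, unused = MockComponent().run(**available)
print(result)  # 6.0
print(unused)  # ['constant_1', 'state_1']
```

Without **kwargs in the signature, the same call would raise a TypeError because of the unexpected keyword arguments.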

Initialise

The initialise method must define the initial conditions for the component states so that its integration can be started. However, the component user may have already set initial component state values that should not be overwritten. This is why state initial conditions must be set only if the component property initialised_states evaluates as False.

This method can also feature any other action that needs to be done only once before the start of the integration, i.e. pre-processing. For this purpose, the component features a special attribute shelf: a dictionary that can be used e.g. to store anything that needs to be computed once in initialise and used repeatedly in run.

It is called at the beginning of a model simulation period.

The possible method parameters in the method signature are the component inputs, states, parameters, and constants.

This method is not expected to return anything.
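The two rules above (set initial conditions only if initialised_states evaluates as False, and keep one-off computations on the shelf) can be sketched without the framework. MockState and MockComponent below are hypothetical stand-ins; in particular, initialised_states is a plain attribute here, whereas it is a component property in the framework.

```python
class MockState:
    """Hypothetical stand-in holding values keyed by time index."""

    def __init__(self):
        self.values = {}

    def get_timestep(self, timeindex):
        return self.values[timeindex]

    def set_timestep(self, timeindex, value):
        self.values[timeindex] = value


class MockComponent:
    """Hypothetical stand-in mimicking the two attributes used here."""

    def __init__(self, initialised_states=False):
        self.initialised_states = initialised_states
        self.shelf = {}

    def initialise(self, state_1, constant_1, **kwargs):
        # do not overwrite initial conditions already set by the user
        if not self.initialised_states:
            state_1.set_timestep(-1, 0.0)
        # computed once here, reused at every call of run
        self.shelf['retention'] = 1.0 - constant_1

    def run(self, state_1, **kwargs):
        state_1.set_timestep(
            0, self.shelf['retention'] * state_1.get_timestep(-1)
        )
        return {}, {}


user_state = MockState()
user_state.set_timestep(-1, 8.0)  # initial condition set by the user
component = MockComponent(initialised_states=True)
component.initialise(state_1=user_state, constant_1=0.25)
component.run(state_1=user_state)
print(user_state.get_timestep(0))  # 6.0
```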

Run

The run method contains the computations required to integrate from one time step to the next.

It is called iteratively to move through the model simulation period. Between each call, the component states are automatically incremented in time by the framework.

The possible method parameters in the method signature are the component inwards, inputs, states, parameters, and constants.

This method is expected to return a tuple of two dictionaries:

  • the first dictionary must contain the component outward transfers (keys are the outwards names, values are the outwards arrays),

  • the second dictionary must contain the component outputs (keys are the outputs names, values are the outputs arrays).

Note, the second dictionary may be empty if the component does not feature any outputs in its definition.

Finalise

The finalise method should contain any action required to guarantee that the simulation completes “elegantly” and can be restarted after the last simulation time step. It can also feature any other action that is required to be done only once after the end of the integration over the whole simulation period.

It is called once at the end of a model simulation period.

The possible method parameters in the method signature are the component states, parameters, and constants.

This method is not expected to return anything.

Example

See a detailed example of a mock component implementation below.

Implementing the three mandatory methods initialise, run, and finalise.
import unifhy


class SurfaceLayerComponent(unifhy.component.SurfaceLayerComponent):
    """component description here"""

    # component definition here

    def initialise(self, state_1, state_2, parameter_1, constant_1, **kwargs):
        if not self.initialised_states:
            # set here initial condition values for component states
            state_1.set_timestep(-1, 0.)
            state_2.set_timestep(-1, 0.)

    def run(self, inwards_1, inwards_2, inwards_3, input_1, input_2, input_3,
            state_1, state_2, parameter_1, constant_1, **kwargs):

        # compute science using available inwards/inputs/parameters/constants
        routed, outed = self.spacedomain.route(inwards_1 + inwards_2 + inwards_3)

        outwards_1 = (routed + input_1 + state_1.get_timestep(-1)
                      / self.timedelta_in_seconds) * parameter_1

        m = self.current_datetime.month

        output_1 = input_2[m - 1, ...] * constant_1

        output_2 = input_2[m - 1, ...] * (1 - constant_1)
        for i in range(4):
            output_2 += (0.05 * state_2.get_timestep(-1)[..., i]
                         / self.timedelta_in_seconds)
        output_2 /= input_3

        # update component state
        state_1.set_timestep(
            0,
            state_1.get_timestep(-1)
            * (1 - self.timedelta_in_seconds * parameter_1)
        )
        state_2.set_timestep(
            0,
            0.95 * state_2.get_timestep(-1)
        )

        # return outwards and outputs
        return (
            {'outwards_1': outwards_1},
            {'output_1': output_1,
             'output_2': output_2}
        )

    def finalise(self, state_1, state_2, parameter_1, constant_1, **kwargs):
        # cleanly wrap up simulation here
        # to be able to restart from where simulation stopped
        pass

Real component implementations are available in the science library section.

This concludes the preparation of your component contribution. The next step is to package your component(s).