Tutorial
This section showcases the basic usage of the modelling framework unifhy
(Unified Framework for Hydrology).
Important
To run this tutorial, you need to install the framework and the relevant science components to be used (here Artemis and RFM). To do so, run the following commands:
$ conda install unifhy
$ pip install unifhycontrib-artemis unifhycontrib-rfm
More details can be found in the installation section.
This tutorial is based on v0.1.0 of the framework, you can check the version you have just installed by starting a Python console and executing the following commands:
>>> import unifhy
>>> print(unifhy.__version__)
0.1.0
The input data required to run this tutorial can be found in the data
folder of the repository.
Configuring a Model
The central object in the framework is the Model
, which combines three
science components for the three compartments of the terrestrial water cycle.
The science components currently available to choose from are listed in
the science library section.
Each science component needs to be spatio-temporally discretised through
SpaceDomain
and TimeDomain
instances, to be given data contained in
a DataSet
instance, and to be given parameter and/or constant values.
See Fig. 3 for more details on how all these concepts are articulated together.
Time
TimeDomain
characterises the time dimension of a Component
, i.e.
the period and the temporal resolution.
>>> from datetime import datetime, timedelta
>>> timedomain = unifhy.TimeDomain.from_start_end_step(
... start=datetime(2017, 1, 1, 0, 0, 0),
... end=datetime(2017, 2, 1, 0, 0, 0),
... step=timedelta(hours=1),
... calendar='gregorian'
... )
>>> print(timedomain)
TimeDomain(
time (744,): [2017-01-01 00:00:00, ..., 2017-01-31 23:00:00] gregorian
bounds (744, 2): [[2017-01-01 00:00:00, ..., 2017-02-01 00:00:00]] gregorian
calendar: gregorian
units: seconds since 1970-01-01 00:00:00Z
period: 31 days, 0:00:00
timedelta: 1:00:00
)
Space
All spatial configurations supported by the framework are subclasses of
SpaceDomain
: they characterise the spatial dimensions of a Component
,
i.e. the region and the spatial resolution. The spatial configurations
currently supported can be found in the API Reference’s
Space section. LatLonGrid
is one example.
>>> spacedomain = unifhy.LatLonGrid.from_extent_and_resolution(
... latitude_extent=(51, 55),
... latitude_resolution=0.5,
... longitude_extent=(-2, 1),
... longitude_resolution=0.5
... )
>>> print(spacedomain)
LatLonGrid(
shape {Y, X}: (8, 6)
Y, latitude (8,): [51.25, ..., 54.75] degrees_north
X, longitude (6,): [-1.75, ..., 0.75] degrees_east
Y_bounds (8, 2): [[51.0, ..., 55.0]] degrees_north
X_bounds (6, 2): [[-2.0, ..., 1.0]] degrees_east
)
Three additional properties of SpaceDomain
may require to be set
depending on the component’s requirements: land_sea_mask,
flow_direction, and cell_area:
land_sea_mask may be used by a component to be aware of where there is land and where there is sea, but if set, it is also used to mask the component records;
flow_direction may be used by a component to determine where to move flow downstream, it is namely compulsory if the component is using the
SpaceDomain
’s route method;cell_area may be used by a component to determine the surface area of each spatial element in the component spacedomain. For some spacedomains (e.g.
LatLonGrid
,RotatedLatLonGrid
,BritishNationalGrid
), the cell area is calculated automatically by the framework if not already set.
>>> import cf
>>> spacedomain.land_sea_mask = (
... cf.read('data/land_sea_mask.nc').select_field('land_binary_mask')
... )
>>> print(spacedomain.land_sea_mask)
[[ True True True True True True]
[ True True True True True True]
[ True True True True True True]
[ True True True True True True]
[ True True True True True False]
[ True True True True True False]
[ True True True True False False]
[ True True True False False False]]
>>> spacedomain.flow_direction = (
... cf.read('data/flow_direction.nc').select_field("long_name=flow direction")
... )
Note
To check whether a component needs any of these spatial properties,
each Component
can be queried via its methods requires_land_sea_mask()
,
requires_flow_direction()
, and requires_cell_area()
.
Data
DataSet
must be used to gather all of the data required to run a Component
of Model
. It is a dictionary-like object that stores references to cf.Field
instances.
Warning
Only data fully compliant with the CF conventions (v1.8 or later) can be used.
>>> dataset_surfacelayer = unifhy.DataSet(
... files=['data/surface_downwelling_longwave_flux_in_air.nc',
... 'data/surface_downwelling_shortwave_flux_in_air.nc',
... 'data/specific_humidity.nc',
... 'data/air_temperature.nc',
... 'data/wind_speed.nc',
... 'data/precipitation_flux.nc',
... 'data/leaf_area_index.nc',
... 'data/canopy_height.nc',
... 'data/soil_albedo.nc'],
... name_mapping={'leaf-area index': 'leaf_area_index',
... 'canopy height': 'vegetation_height',
... 'soil albedo': 'surface_albedo'}
... )
>>> print(dataset_surfacelayer)
DataSet{
air_temperature(time(744), latitude(20), longitude(28)) K
leaf_area_index(time(12), latitude(360), longitude(720)) 1
precipitation_flux(time(744), latitude(20), longitude(28)) kg m-2 s-1
specific_humidity(time(744), latitude(20), longitude(28)) kg kg-1
surface_albedo(latitude(360), longitude(720)) 1
surface_downwelling_longwave_flux_in_air(time(744), latitude(20), longitude(28)) W m-2
surface_downwelling_shortwave_flux_in_air(time(744), latitude(20), longitude(28)) W m-2
vegetation_height(latitude(360), longitude(720)) m
wind_speed(time(744), latitude(20), longitude(28)) m s-1
}
>>> dataset_subsurface = unifhy.DataSet(
... files=['data/saturated_hydraulic_conductivity.nc',
... 'data/available_water_storage_capacity.nc',
... 'data/topographic_index.nc'],
... name_mapping={
... 'saturated hydraulic conductivity': 'saturated_hydraulic_conductivity',
... 'available water storage capacity': 'topmodel_saturation_capacity',
... 'topographic index': 'topographic_index'
... }
... )
>>> dataset_openwater = unifhy.DataSet(
... files='data/flow_accumulation.nc',
... name_mapping={
... 'RFM drainage area in cell counts (WFDEI)': 'flow_accumulation'
... }
... )
Science
Each science component is either of a SurfaceLayerComponent
,
SubSurfaceComponent
, or OpenWaterComponent
type, and it embeds the
biophysical processes for the surface, sub-surface, or open water parts
of the water cycle, respectively. All three types share the framework
Component
class as their base class, which makes them readily compatible
with unifhy
. They have the same API, only their interfaces and data
needs differ (i.e their signatures).
>>> import unifhycontrib.artemis
>>> print(unifhycontrib.artemis.SubSurfaceComponent)
SubSurfaceComponent(
category: subsurface
inwards metadata:
canopy_liquid_throughfall_and_snow_melt_flux [kg m-2 s-1]
transpiration_flux_from_root_uptake [kg m-2 s-1]
inputs metadata:
topmodel_saturation_capacity [mm m-1]
saturated_hydraulic_conductivity [m s-1]
topographic_index [1]
requires land sea mask: False
requires flow direction: False
requires cell area: False
constants metadata:
m [1]
rho_lw [kg m-3]
S_top [m]
states metadata:
subsurface_store [m]
outwards metadata:
soil_water_stress_for_transpiration [1]
surface_runoff_flux_delivered_to_rivers [kg m-2 s-1]
net_groundwater_flux_to_rivers [kg m-2 s-1]
)
Note
This information is also available on the online documentation, e.g. see Artemis subsurface component page.
To get a science component instance, one needs to provide a TimeDomain
instance, a SpaceDomain
instance, a DataSet
instance, and parameter
and/or constant values as per indicated in its signature.
>>> component = unifhycontrib.artemis.SubSurfaceComponent(
... saving_directory='outputs',
... timedomain=timedomain,
... spacedomain=spacedomain,
... dataset=dataset_subsurface,
... parameters={},
... records={
... 'surface_runoff_flux_delivered_to_rivers': {timedelta(days=1): ['mean']}
... }
... )
>>> print(component)
SubSurfaceComponent(
category: subsurface
saving directory: outputs
timedomain: period: 31 days, 0:00:00
spacedomain: shape: (Y: 8, X: 6)
records:
surface_runoff_flux_delivered_to_rivers: 1 day, 0:00:00 {'mean'}
)
Note
The variables that can be recorded using the records optional parameter are the component’s outwards, outputs, and states. By default, none are recorded.
Framework
Model
constitutes the core object of the modelling framework. It needs
to be instantiated with three science Component
instances, one for each
of the three parts of the terrestrial water cycle. It combines these three
science components to make a fully functional model for the terrestrial
water cycle.
>>> import unifhycontrib.rfm
>>> model = unifhy.Model(
... identifier='tutorial',
... config_directory='configurations',
... saving_directory='outputs',
... surfacelayer=unifhycontrib.artemis.SurfaceLayerComponent(
... 'outputs', timedomain, spacedomain, dataset_surfacelayer,
... parameters={}
... ),
... subsurface=unifhycontrib.artemis.SubSurfaceComponent(
... 'outputs', timedomain, spacedomain, dataset_subsurface,
... parameters={}
... ),
... openwater=unifhycontrib.rfm.OpenWaterComponent(
... 'outputs', timedomain, spacedomain, dataset_openwater,
... parameters={'c_land': (0.20, 'm s-1'),
... 'cb_land': (0.10, 'm s-1'),
... 'c_river': (0.62, 'm s-1'),
... 'cb_river': (0.15, 'm s-1'),
... 'ret_l': (0.0, '1'),
... 'ret_r': (0.005, '1'),
... 'routing_length': (50000, 'm')},
... records={
... 'outgoing_water_volume_transport_along_river_channel': {
... timedelta(days=1): ['mean']
... }
... }
... )
... )
>>> print(model)
Model(
identifier: tutorial
config directory: configurations
saving directory: outputs
surfacelayer: unifhycontrib.artemis.surfacelayer
subsurface: unifhycontrib.artemis.subsurface
openwater: unifhycontrib.rfm.openwater
)
Warning
While the resolutions of the three components of the Model
can be
different, there are limitations.
In space, the SpaceDomain
s of the three components must:
be in the same coordinate system (e.g. all using
LatLonGrid
);span the same geographical region (i.e. the edges of their domains must overlap);
feature resolutions that are integer multiples of one another (i.e. grid cells of a domain must fully include or fully be included in grid cells of the other domains).
In time, the TimeDomain
s of the three components must:
be in the same calendar (e.g. all in ‘gregorian’);
span the same period (i.e. the start and end of their domains must coincide);
feature resolutions that are integer multiples of one another (i.e. time steps of a domain must fully include or fully be included in time steps of the other domains)
Note
DataComponent
and NullComponent
represent convenient utility to
substitute science components. Any of the three components can be
replaced by these alternatives.
DataComponent
is provided to act the part of a component of the water
cycle by sourcing the component’s outwards transfers from a DataSet
,
e.g. containing outputs of a previous simulation.
NullComponent
is provided to ignore a component of the water cycle
by not processing the component’s inwards transfers received, and by
returning null values for the component’s outwards transfers.
At this stage, the Model
as such is fully configured, and the configuration
is automatically saved as a YAML file in the config_directory and named using the
identifier (e.g. in this tutorial, the file would be at
configurations/tutorial.yml). This file can be reused later to directly obtain
a configured model.
>>> another_model = unifhy.Model.from_yaml('configurations/tutorial.yml')
>>> print(another_model)
Model(
identifier: tutorial
config directory: configurations
saving directory: outputs
surfacelayer: unifhycontrib.artemis.surfacelayer
subsurface: unifhycontrib.artemis.subsurface
openwater: unifhycontrib.rfm.openwater
)
See the files section for an example of such model configuration YAML file.
Simulating with a Model
Spin-Up and Simulate
Once configured, the instance of Model
can be used to start a spin up run
and/or a main simulation run.
>>> model.spin_up(start=datetime(2017, 1, 1, 0, 0, 0),
... end=datetime(2017, 1, 3, 0, 0, 0),
... cycles=2,
... dumping_frequency=timedelta(days=3))
>>> model.simulate(dumping_frequency=timedelta(days=2))
Resume
If the model has terminated prematurely, and dumping_frequency was
set in the spin-up and/or simulate invocations, a series of snapshots
in time have been stored in dump files in the saving_directory of each
Component
and of the Model
. A resume method for Model
allows
for the given run to be resumed to reach completion of the simulation
period. The tag parameter must be used to select which run to resume
(i.e. any spin-up cycle, or the main run), and the at parameter can be
used to select the given snapshot in time to restart from.
>>> model.resume(tag='run', at=datetime(2017, 1, 7, 0, 0, 0))
Note
Further training resources are available at https://github.com/unifhy-org/unifhy-training.