
MESH - A Community Hydrology - Land Surface Model



This page describes how to create river classes in MESH_drainage_database.r2c using drainage area (DA) and optional GRU filtering. This exercise uses the RiverClasses_modified.py and RiverClasses_modified.R scripts. Use only one of the two, depending on whether python or R/Rscript is preferred. Both scripts contain the same logic and functionality.

Scripts

Two scripts are available to define river classes (IAK) using drainage area (DA) and optional GRU filters: one for python and one for R/Rscript.

Language-dependent scripts

RiverClasses_modified.py (python)

Use of the RiverClasses_modified.py script requires python with "numpy" installed. It also requires the ensim_utils.py script to exist in the same folder. "ensim_utils" is a standalone python script developed for MESH applications that use EnSim Hydrologic (Green Kenue) file formats; it cannot be installed using a package manager or via 'pip'.

RiverClasses_modified.R (R/Rscript)

Use of the RiverClasses_modified.R script requires an existing R and Rscript installation. Further, it has the following package prerequisites: "MESHr", "tools", and "stats". Prerequisites and installation instructions for the "MESHr" package are described in the "README" file on the MESHr GitHub page.

Configuration

This code contains the following configuration variables.

Variable | Default value | Description | Units
workdir | '.' | The absolute path to the folder that contains the input drainage database file and where the output drainage database file will be saved (default: current folder). | --
input_db | 'MESH_drainage_database.r2c' | The name of the input drainage database file including extension. | --
output_db | (see notes) | The name of the output drainage database file (if not specified, a file name is automatically determined based on 'input_db', see below for details). | --
IAKmax | 5 | The maximum number of river classes to define in the output drainage database file. | --
GRUfilter | python: [], R: c() | The GRUs to consider when masking with "GRU override" (see below for details); set the variable equal to "[]" or "None" (python) or "c()" or "NULL" (R/Rscript) to disable "GRU override". | --
GRUthreshold | 70.0 | The minimum percent of coverage to consider when masking a grid cell to the GRU if "GRU override" is enabled. | %

Automatic output file name 'output_db'

In most cases, 'output_db' can be left blank. Set it only if a specific output file name is desired.

If 'output_db' is not specified, an output drainage database file name is automatically determined by appending "_IAK" to the 'input_db' file name, before the file extension.

For example, if provided:

input_db = 'MESH_drainage_database.r2c'
output_db = ''

The script will automatically set 'output_db' equal to:

output_db = 'MESH_drainage_database_IAK.r2c'
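
The naming rule above can be sketched in python as follows. This is only an illustration of the described behaviour, not the code used in the scripts; the function name 'default_output_name' is hypothetical.

```python
import os


def default_output_name(input_db, output_db=""):
    """Return 'output_db' if given, else 'input_db' with "_IAK" before the extension.

    Hypothetical helper; it only mirrors the rule described on this page.
    """
    if output_db:
        return output_db
    base, ext = os.path.splitext(input_db)  # ("MESH_drainage_database", ".r2c")
    return base + "_IAK" + ext


print(default_output_name("MESH_drainage_database.r2c"))
# MESH_drainage_database_IAK.r2c
```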

Overriding variables with command arguments

When running the script in "Command Prompt" in Windows or interactively at the terminal (e.g., Ubuntu), the configuration variables can be assigned in the call that runs the script. Using this approach, there is no need to modify or update these variables in the script itself.

When providing values to the variables in the call that runs the script, variables are assigned as "variable=value". There should be no added characters (including spacing) between the variable, equal sign, and value.

Each option should be specified only once. If an option is listed more than once, only the value of its last occurrence is used.

Different options can be combined in a single command. For example:

python RiverClasses_modified.py workdir=absolute_path input_db=new_Shd.r2c GRUfilter=5

In the option "variable=value", "variable" is not case-sensitive. For example, "GRUfilter" and "grufilter" are equivalent and interpreted the same.
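
The rules above ("variable=value" with no spaces, names that are not case-sensitive, and the last occurrence winning) can be illustrated with a minimal python sketch. It is written under the assumption that arguments arrive as a plain list of strings; the function name 'parse_overrides' is hypothetical and the sketch does not reproduce the parsing code in the scripts themselves.

```python
def parse_overrides(args):
    """Parse "variable=value" tokens into a dict keyed by lower-cased name."""
    overrides = {}
    for token in args:
        name, sep, value = token.partition("=")
        if not sep:
            continue  # not a "variable=value" token; ignored in this sketch
        # Lower-casing makes "GRUfilter" and "grufilter" the same key;
        # assigning again means a later duplicate replaces an earlier one.
        overrides[name.lower()] = value
    return overrides


example = ["workdir=/tmp/run", "GRUfilter=5", "grufilter=6"]
print(parse_overrides(example))
# {'workdir': '/tmp/run', 'grufilter': '6'}
```

In a real script the list would typically come from 'sys.argv[1:]'.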

Options
workdir=absolute_path

Defines the absolute path to 'workdir'. This argument can be omitted if the terminal is already inside the desired folder. For example, running "cd" to 'folder_name' and specifying 'workdir=folder_name' at the same time is redundant:

cd folder_name && python RiverClasses_modified.py workdir=folder_name

Instead, "cd" to the folder and run the script without 'workdir' specified:

cd folder_name && python RiverClasses_modified.py

Or, run the script from any location with "workdir=folder_name" specified, where 'folder_name' is an absolute path:

python RiverClasses_modified.py workdir=folder_name

input_db=filename.r2c

Defines 'input_db' (name of the input drainage database file including extension).

output_db=filename.r2c

Defines 'output_db' (name of the output drainage database file including extension).

iakmax=N

'N' defines 'IAKmax'. For example:

python RiverClasses_modified.py iakmax=5

gruthreshold=D

'D' defines 'GRUthreshold' as a percent value. 'GRUthreshold' is only used if 'GRUfilter' is also specified. For example:

python RiverClasses_modified.py gruthreshold=70.0 grufilter=6

grufilter=N
grufilter=N,I
grufilter=N,I,J
etc.

'N', 'N,I', or 'N,I,J' define the GRU ID(s) to assign to 'GRUfilter'. When passing 'GRUfilter' as a command argument, the "[]" (python) or "c()" (R/Rscript) wrapper is omitted.

A single value can be specified or multiple values can be specified as a comma-separated list. There should be no added characters (including spacing) between the variable, equal sign, and GRU IDs.

To specify a single GRU using the default 'GRUthreshold' value:

python RiverClasses_modified.py grufilter=5

To specify two GRUs using the default 'GRUthreshold' value:

python RiverClasses_modified.py grufilter=5,6
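
As a rough illustration of how a comma-separated 'grufilter' value maps back to the list form used inside the scripts, consider the following python sketch. The function name 'parse_gru_filter' is hypothetical, and treating an empty value as "no GRUs" is an assumption made for this example.

```python
def parse_gru_filter(value):
    """Convert "5" or "5,6" into a list of integer GRU IDs.

    An empty string gives an empty list (assumed here to mean that
    "GRU override" is disabled).
    """
    if not value:
        return []
    return [int(item) for item in value.split(",")]


print(parse_gru_filter("5"))    # [5]
print(parse_gru_filter("5,6"))  # [5, 6]
```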

Assigning river class by drainage area (DA)

The script assigns river classes based on thresholds determined from percentiles (or quantiles) of the natural logarithm of the drainage areas of active cells in the domain. For threshold 'i', the quantile probability is 'i/RC', where 'RC' is the number of river classes assigned by drainage area ('IAKmax', or 'IAKmax - 1' if "GRU override" is enabled). Cells with larger drainage areas receive lower river class IDs.

R/Rscript
# Determine indexing based on the quantiles of the log of 'DA'.
# Use local variables.
#   - RC:
#       IAKmax if "GRU override" is disabled, otherwise (IAKmax - 1)
#       (IAKmax itself is reserved for the override field).
#   - B: Local variable for 'DA' (natural logarithm transform).
#   - B1: 'B' as a vector (as opposed to an array).
#   - X: Local variable to store the quantile thresholds.
# Based on the Matlab approach used by M. Elshamy for GWF domains.
#   After the natural logarithm transform, -Inf values are converted to
#   NaN so they are recognized and omitted by the 'quantile' function
#   with the 'na.rm = TRUE' option.
if (!is.null(GRUfilter) && !is.null(GRUthreshold)) {
    RC <- (IAKmax - 1)
} else {
    RC <- IAKmax
}
B <- log(shed$basin[ , , da_index])
B[B == -Inf] <- NaN
B1 <- as.vector(B)
X <- numeric(RC - 1)
for (i in seq_len(RC - 1)) {
    X[i] <- quantile(B1, probs = i/RC, na.rm = TRUE)
}

# Apply the background field (filtered by NaN to target active grid
#   cells in the domain).
IAK[!is.nan(B)] <- RC

# Apply IAK values by comparing the transformed values in 'B' to the 'X'
#   thresholds (up to (RC - 1); RC is applied as the background field
#   above).
for (i in seq_len(RC - 1)) {
    IAK[B > X[i]] <- RC - i
}
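
For readers following the python script, the same quantile logic can be sketched with "numpy" as shown below. This is a minimal sketch of the approach in the R/Rscript excerpt above, not the contents of RiverClasses_modified.py. The function name 'classify_by_drainage_area' is hypothetical, and using 0 as the value for inactive cells is an assumption made for this example.

```python
import numpy as np


def classify_by_drainage_area(da, n_classes):
    """Assign classes 1..n_classes from quantiles of log(DA).

    Cells with DA <= 0 are treated as inactive and left at 0 (assumption).
    """
    da = np.asarray(da, dtype=float)
    active = da > 0
    log_da = np.full(da.shape, np.nan)
    log_da[active] = np.log(da[active])

    # Background field: every active cell starts in the last class.
    iak = np.zeros(da.shape, dtype=int)
    iak[active] = n_classes

    # Threshold i uses probability i / n_classes; cells above it move to
    # class (n_classes - i), so larger drainage areas get lower class IDs.
    for i in range(1, n_classes):
        threshold = np.nanquantile(log_da, i / float(n_classes))
        with np.errstate(invalid="ignore"):
            iak[log_da > threshold] = n_classes - i
    return iak


example_da = [[0.0, 1.0, 10.0], [100.0, 1000.0, 10000.0]]
print(classify_by_drainage_area(example_da, 5).tolist())
# [[0, 5, 4], [3, 2, 1]]
```

The default interpolation of 'np.nanquantile' corresponds to the default quantile type in R, so the two sketches should agree on the thresholds for the same input.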

"GRU override"

"GRU override" will reserve the last IAK value and mask grid cells in the domain where the percent coverage of the GRU(s) defined in 'GRUfilter' are above 'GRUthreshold'. "GRU override" is disabled if 'GRUthreshold' is "None" (python) or "NULL" (R/Rscript), or if 'GRUfilter' is equal to "[]" (python) or "c()" or "NULL" (R/Rscript).

R/Rscript
# "GRU override"
#   Mask values where the specified GRUs in 'GRUfilter' have coverage
#   greater than or equal to the threshold defined by 'GRUthreshold'
#   (converted from user-provided percent to fraction; values exist as
#   fractions in the input drainage database file).
# GRUs are always at the end of the file:
#   ('attribute_count' - 'gru_count' + 'i').
# Skip if "GRU override" is disabled.
if (!is.null(GRUfilter) && !is.null(GRUthreshold)) {
    for (i in GRUfilter) {
        IAK[shed$basin[, , (attribute_count - gru_count + i)] >= (GRUthreshold/100.0)] <- IAKmax
    }
}
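
The masking step can be sketched in python as follows. The sketch assumes the GRU fractions are available as a three-dimensional "numpy" array indexed by GRU first, with GRU IDs starting at 1; the function name 'apply_gru_override' and this array layout are hypothetical and chosen only for illustration. As in the R/Rscript excerpt above, a cell is masked when the fraction is greater than or equal to the threshold.

```python
import numpy as np


def apply_gru_override(iak, gru_fractions, gru_filter, gru_threshold, iak_max):
    """Set cells to 'iak_max' where any filtered GRU covers enough of the cell.

    gru_fractions: array of shape (n_gru, ny, nx) holding fractions (0 to 1).
    gru_filter: list of 1-based GRU IDs; empty or None disables the override.
    gru_threshold: percent value; None disables the override.
    """
    out = np.array(iak, copy=True)
    if not gru_filter or gru_threshold is None:
        return out  # "GRU override" is disabled
    for gru_id in gru_filter:
        fraction = np.asarray(gru_fractions)[gru_id - 1]
        out[fraction >= gru_threshold / 100.0] = iak_max
    return out


iak = np.array([[1, 2], [3, 4]])
fractions = np.array([
    [[0.80, 0.10], [0.00, 0.75]],  # GRU 1
    [[0.20, 0.90], [1.00, 0.25]],  # GRU 2
])
print(apply_gru_override(iak, fractions, [1], 70.0, 5).tolist())
# [[5, 2], [3, 5]]
```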

Examples

"GRU override" is disabled

If provided:

R/Rscript
IAKmax = 5
GRUfilter = c()
GRUthreshold = 70.0
Python
IAKmax = 5
GRUfilter = []
GRUthreshold = 70.0

'GRUfilter' is not assigned any values, so "GRU override" is disabled. Five river classes (from 'IAKmax') will be assigned based on the drainage areas (DA) read from the input drainage database file.

The script will print the following summary to screen:

[1] "Work directory: ."
[1] "Input file: souris_split.r2c"
[1] "Output file: souris_split_IAK.r2c"
[1] "IAKmax: 5"
[1] "GRU override is not active."
[1] TRUE

"GRU override" is enabled

If provided:

R/Rscript
IAKmax = 5
GRUfilter = c(7, 8)
GRUthreshold = 70.0
Python
IAKmax = 5
GRUfilter = [7, 8]
GRUthreshold = 70.0

'GRUfilter' and 'GRUthreshold' are both assigned, so "GRU override" is enabled. Four river classes (1 to ('IAKmax' - 1)) will be assigned based on drainage areas (DA) read from the input drainage database file. The fifth river class (IAK = 5, from 'IAKmax') is assigned in grid cells where GRU 7 or GRU 8 has a fraction of at least 0.70 (70%).

The script will print the following summary to screen:

[1] "Input file: souris_split.r2c"
[1] "Output file: souris_split_IAK.r2c"
[1] "IAKmax: 5"
[1] "GRU override is active."
[1] "GRUthreshold: 70 %"
[1] "GRUfilter 7" "GRUfilter 8"
[1] TRUE

Usage

The RiverClasses_modified.py and RiverClasses_modified.R scripts can be run in Integrated Development Environments (e.g., RStudio for R/Rscript), using "Command Prompt" in Windows, or interactively using the terminal in Unix-alike systems. In all cases, the necessary prerequisite packages must be installed.

Required packages (dependencies)

Installing packages requires an Internet connection and may take a while, especially if installing multiple packages for the first time.

RiverClasses_modified.py (python)

The RiverClasses_modified.py script requires the "numpy" package. It also requires the ensim_utils.py script to exist alongside the script in the same folder.

Unix-alike systems (Ubuntu)

To install python (2.7) in Ubuntu (including Ubuntu via the Windows Subsystem for Linux) with "numpy" and other useful utilities:

  1. Update the package repositories by running the following command:

    sudo apt-get update
    If this is the first time updating the system, consider also running "sudo apt-get upgrade" after the 'update' command has finished.
  2. Install python and other useful python packages by running the following command:

    sudo apt-get install python2.7 python-pip python-numpy python-scipy python-matplotlib python-pandas

RiverClasses_modified.R (R/Rscript)

The RiverClasses_modified.R script requires the "MESHr", "tools", and "stats" packages and their dependencies.

RStudio

To install the "tools" and "stats" packages in RStudio (this process does not apply for "MESHr"):

  1. Click Tools > Install Packages... from the toolbar, or click "Install" in the "Packages" pane.
  2. List the packages to install in the "Packages" box and click "Install".
  3. Once installed, the packages are listed in the table in the "Packages" pane.

To install "MESHr" in RStudio, follow the installation instructions in the "README" file on the MESHr GitHub page.

Unix-alike systems (Ubuntu)

In Ubuntu, R and its packages may require other system dependencies, such as: "build-essential" "libudunits2-dev" "cargo" "libmagick++-dev" "libgeos-dev" "libgdal-dev" "libssl-dev" "libxml2-dev".

To install R in Ubuntu (including Ubuntu via the Windows Subsystem for Linux):

  1. Update the package repositories by running the following command:

    sudo apt-get update
    If this is the first time updating the system, consider also running "sudo apt-get upgrade" after the 'update' command has finished.
  2. Install the necessary base packages for R, as well as the "r-base" and "r-base-dev" packages, by running the following command:

    sudo apt-get install build-essential libudunits2-dev cargo libmagick++-dev libgeos-dev libgdal-dev libssl-dev libxml2-dev r-base r-base-dev

Steps

RiverClasses_modified.py (python)

Running the RiverClasses_modified.py script requires the ensim_utils.py script to exist alongside the script in the same folder.

The script may return a remark that the 'rpnpy' library cannot be loaded. This is normal on systems without ARMNLIB, and specifically rpnpy, installed.

Unix-alike systems (Ubuntu)

To run the RiverClasses_modified.py script interactively using the terminal in Unix-alike systems:

  1. "cd" to the folder that contains the input drainage database file.
  2. Run the script (passing command arguments as necessary); focus will return to the terminal once finished.

RiverClasses_modified.R (R/Rscript)

If updating 'workdir', replace backslashes '\' with forward slashes '/' when pasting paths copied from Windows.

RStudio

If using RStudio, the working directory can be changed directly using the IDE itself (e.g., Session > Set Working Directory > Choose Directory... from the toolbar, or using the "Files" pane). In this case, the 'workdir' variable in the RiverClasses_modified.R script should be left at its default value.

To run the RiverClasses_modified.R script in RStudio:

  1. Open the RiverClasses_modified.R script.
  2. Scroll to and update the configuration variables as necessary.
  3. Run the script; focus will return to the "Console" pane once finished.

"Command Prompt" in Windows

If using "Command Prompt" in Windows, R and the "MESHr", "tools", and "stats" packages and their dependencies must still be installed. This may already be the case if RStudio is installed, and these dependencies were installed in RStudio.

When running "cd" to a folder in "Command Prompt" in Windows, if the location exists on a different drive than the currently selected drive, type and run "<desired drive letter>:" to switch to the desired drive (where "<desired drive letter>" is replaced with the letter itself).

For example, suppose "Command Prompt" defaults to a location on "C:" when opened. After running "cd" to a location on "M:", the terminal still points to the location on "C:". Type and run "M:" to switch active drives, and the terminal should now point to the desired location.

To run the RiverClasses_modified.R script in the "Command Prompt" in Windows:

  1. Open "Command Prompt" ("cmd.exe").
  2. "cd" to the folder that contains the input drainage database file.
  3. Run the script (passing command arguments as necessary); focus will return to "Command Prompt" once finished.

Unix-alike systems (Ubuntu)

To run the RiverClasses_modified.R script interactively using the terminal in Unix-alike systems:

  1. "cd" to the folder that contains the input drainage database file.
  2. Run the script (passing command arguments as necessary); focus will return to the terminal once finished.

Command argument examples

Providing 'workdir' ("Command Prompt" in Windows)

When working with paths in R, backslashes '\' must be replaced with forward slashes '/' when pasting paths copied from Windows.

"C:\Program Files\R\R-3.5.1\bin\Rscript.exe" RiverClasses_modified.R workdir=C:/Users/user/path/folder_name
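
If many paths need converting, the backslash-to-forward-slash replacement described above can be done programmatically. The following is a small sketch using python's standard "pathlib" module; the function name 'to_forward_slashes' is hypothetical and the path is the illustrative one from the command above.

```python
from pathlib import PureWindowsPath


def to_forward_slashes(windows_path):
    """Return a Windows path written with forward slashes."""
    return PureWindowsPath(windows_path).as_posix()


print(to_forward_slashes(r"C:\Users\user\path\folder_name"))
# C:/Users/user/path/folder_name
```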

Providing 'workdir' (Ubuntu)

Rscript RiverClasses_modified.R workdir=/home/user/path/folder_name

Providing 'input_db' and 'GRUfilter' (Ubuntu)

When passing 'GRUfilter' as a command argument, the "[]" (python) or "c()" (R/Rscript) wrapper is omitted and the GRU IDs are passed as a comma-separated list. There should be no added characters (including spacing) between the variable, equal sign, and GRU IDs.

Rscript RiverClasses_modified.R input_db=new_Shd.r2c GRUfilter=7,8