Tutorial - Running a time series simulation for a single grid cell
This document contains a basic tutorial and training exercise for getting started using the VIC model. It assumes that you have a working knowledge of Linux/Unix, and that you have access to a Purdue Linux cluster.
- If you are a member of PHIG then you can request access to current cluster resources through Dr. Keith A. Cherkauer.
- If you are a member of another research group with access to cluster resources, you can request access through your advisor.
- If you are enrolled in ABE 65100 Environmental Informatics you should have access to the scholar cluster through the class.
Step 1: Setting up the model for the simulation tutorial
- Log onto the correct Linux cluster system using your Purdue career account and password.
- Make a new directory here called "VICtutorial".
- Change into the new directory.
- While in the VICtutorial directory, download two files:
  - a file with sample VIC model inputs, VICsample.tgz, and
  - a file with the VIC model source code being used for this tutorial, VIC_code_4.1.2.c.tgz.
- Both files can be downloaded to the VICtutorial directory using the wget command from the source ftp://ftp.ecn.purdue.edu/hydrodat/TutorialFiles, e.g.

  wget ftp://ftp.ecn.purdue.edu/hydrodat/TutorialFiles/VICsample.tgz

- Untar and uncompress the file using gunzip and tar (gunzip VICsample.tgz; tar -xvf VICsample.tar), or do both in one step (tar -xvzf VICsample.tgz).
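The two extraction methods are equivalent. A minimal sketch of both, demonstrated on a small locally created archive (the archive contents here are stand-ins, not the real VICsample.tgz download):

```shell
# Build a stand-in archive so both extraction methods can be demonstrated.
mkdir -p VICsample
echo "placeholder" > VICsample/readme.txt
tar -czf copy1.tgz VICsample
cp copy1.tgz copy2.tgz
rm -r VICsample

# Method 1: two steps (gunzip turns the .tgz into a .tar, then tar extracts it)
gunzip copy1.tgz            # produces copy1.tar
tar -xvf copy1.tar
rm -r VICsample             # reset before trying method 2

# Method 2: one step (the z flag decompresses on the fly)
tar -xvzf copy2.tgz
ls VICsample
```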
Step 2: Compiling the model
- Change to the VIC_4.1.2.c sub-directory that was created when you untarred the source code file.
- List the files in the directory. You will see a long list of files ending in ".c" and a few ending in ".h". These are all part of the model's source code.
  - Files with the ".c" extension are C source code. Open any of them and you will see parts of a large program written in the C programming language. Even if you have not written a C program, there will be many parts of the program that you will recognize if you have programmed in any other language.
  - Files with the ".h" extension are C header files. These contain C statements and definitions that are shared broadly across the larger program, such as definitions of functions and constants.
- As C is a compiled language, not an interpreted language, you cannot run any of the ".c" or ".h" files directly; instead you will have to use a compiler to convert the human-readable source code into a binary or executable file (think ".exe" on Microsoft Windows). There are many C compilers available (see the scholar cluster documentation for a sample), and all of them will compile a basic C program into workable code. Selection of the compiler matters most when working on unique hardware or when you need to meet specific performance criteria (for example, compiling a program to work in parallel across multiple CPUs, or maximizing performance on the Blue Waters supercomputer).
- For this tutorial we will make use of the GNU C compiler (gcc), which is quite common on Linux and Unix systems, open source, and freely available. If you type "module list" at the scholar prompt, you should find that you already have a version of the gcc compiler loaded. If you look at the available modules ("module avail") you will generally see several versions of gcc as well as other compilers (e.g., intel). The default gcc compiler is sufficient for this assignment, so simply use "module load gcc" and do not add additional version specifications.
- The simplest application of the compiler is "gcc source_code.c". This compiles the file source_code.c into a binary executable file called a.out. You can feed the compiler many source files to compile into a single binary executable. There are also many flag options that modify the behavior of the compiler - they allow you to change the output file name, include standard libraries, or even compile into a library that can be used by other programs.
- Given the number of ".c" and ".h" files in this directory, you are bound to make a mistake when trying to type everything into a single-line command. That is where the makefile comes into play. The makefile is a control file for the compiler. It can range from a simple wrapper around the gcc command to a much more automated installation application that checks for required libraries and downloads and compiles missing components on its own.
- In this directory you will find a file called "Makefile". If you open it, you will find references to the gcc compiler, flags, libraries, and source and object files. Towards the end of the file are a number of lines that start with keywords ending in ":" - these define commands that can be called as part of the make process.
- You will now build the binary executable for the VIC model by typing "make" in the directory with the source code and makefile. Many "gcc" and related commands will scroll past on the screen.
- Assuming that there were no errors, you will now see a ".o" file in the directory for each ".c" file that was there when you started. These are object files, each the compiled binary of the respective source file, but each provides only a piece of the larger program. They cannot be run on their own, but they can be "linked" together to build the binary executable.
- The binary executable file should also now be in the directory with the source code. Look for it by typing "ls -l vicNl". The file should exist and have executable permissions set. It is a binary file, so you cannot view its contents, but if you type the new command at the system prompt you should see the following:

  {prompt} ls -lh vicNl
  -rwxr-xr-x 1 cherkaue agen 1.1M Apr 8 14:20 vicNl*
  {prompt} vicNl
  Usage: vicNl [-v | -o | -g<global_parameter_file>]
    v: display version information
    o: display compile-time options settings (set in user_def.h)
    g: read model parameters from <global_parameter_file>.
       <global_parameter_file> is a file that contains all needed model
       parameters as well as model option flags, and the names and
       locations of all other files.
  {prompt}

  By default, scholar does not appear to include the current directory (e.g., ".") in its list of paths to search when trying to fulfill a command (see if it appears in the list resulting from the command "printenv PATH"). If the current directory is not in the path, then when you type "vicNl" the system will return "Command not found" even though it compiled correctly. To get around this problem type "./vicNl", which forces the system to run the version of "vicNl" in the current directory.
- If you got the usage message, then you are ready to get the model simulations started.
- Before leaving this directory, use the "make clean" command to remove all of the ".o" object files that are no longer needed.
Step 3: Checking out model input files
The VIC model requires a minimum of five types of input files: the global control file, the vegetation library file, the vegetation parameter file, the soil parameter file and the daily weather data files. In addition, there are two optional files: the lake parameter and the snow elevation band files. The best source for additional information is the UW VIC model documentation site.
- The soil parameter, vegetation parameter, lake parameter and snow band files all contain information for each model grid cell to be simulated, one cell per line, referenced by the grid cell id number.
- The weather data files each contain weather data for one model grid cell, so the total number of weather files will be equal to the total number of grid cells to be simulated.
- The vegetation library file contains a look-up table of vegetation types that applies to the entire model domain.

The sample files provided here allow you to run VIC for one grid cell only; to add more cells, you would add more lines to the soil, vegetation, lake and snow band parameter files. A snow band file has not been included in this sample.
- If you uncompressed and untarred the file VICsample.tgz in the VICtutorial subdirectory, you should now have a new sub-directory called "ACRESample".
- Within the ACRESample sub-directory there will be two sub-directories: inputs and plots.
- Change into the inputs sub-directory to find the VIC model input files described below:
Global Control File (global_4.1.2.txt)
Tells the model what to run and where to put the output. The following are items that you are likely to change depending on your simulation:
- Simulation time: STARTYEAR, STARTMONTH, ENDYEAR, ENDMONTH
- Output location: RESULT_DIR
- Output file name: OUTFILE
- Output variable: OUTVAR
  - A current listing of output variable names is provided in Appendix B.
  - Multi-layer variables (e.g. OUT_SOIL_MOIST, OUT_SWE_BAND, OUT_SOIL_TNODE) will result in multiple output columns, depending on the number of soil layers, snow bands or thermal nodes, respectively.
Vegetation Library File (veglib_lumped.txt)
Contains a 'look-up table' of vegetation resistance values, LAI, albedo, roughness height and displacement height for 14 default vegetation types.
- You should not need to edit this file after initial model set-up; the file may need to be edited for new applications if the vegetation types are not representative of your location. There is no limit to the number or types of vegetation included.
- Default vegetation classes for this tutorial include:

  Class Number  Name
  1   evergreen needleleaf forest
  2   evergreen broadleaf forest
  3   mixed forest
  4   deciduous needleleaf forest
  5   deciduous broadleaf forest
  6   needleleaf evergreen shrubland
  7   grassland
  8   bogs
  9   riparian vegetation
  10  tundra
  11  croplands
  12  forest-natural vegetation complex
  13  forest-cropland complex
  14  cropland-grassland complex
Vegetation parameter file (vegetation.parameter.412.txt)
Describes the number (and fraction) of vegetation types to be run separately and averaged by land surface area to provide VIC model grid output.
- The file also describes the distribution of roots for each vegetation type for each cell.
- The file can also include monthly values for Leaf Area Index (LAI) to allow for spatial differences in seasonal plant development.
- Sample for one grid cell (grid cell id = 1) with two vegetation types:

  1 2
  4 0.4 0.10 0.10 0.75 0.60 0.50 0.30
  .05 .02 .05 .25 1.5 3.0 4.5 5.0 2.5 0.5 .05 .02
  7 0.6 0.10 0.10 0.75 0.60 0.50 0.30
  .05 .02 .05 .25 1.5 3.0 4.5 5.0 2.5 0.5 .05 .02
- Explanation:
  - Line 1 (repeats for each grid cell): <cell no> <no. veg types>
  - Line 2 (repeats for no. veg types for each cell): <veg class id> <fraction of grid cell> <depth 1> <root fraction 1> <depth 2> <root fraction 2> <depth 3> <root fraction 3>
  - Line 3 (repeats for no. veg types for each cell): <Jan LAI> <Feb LAI> ... <Dec LAI>
  - Line 3 is only included if GLOBAL_LAI is set to TRUE in the global file.
Soil parameter file (soil.parameter.txt)
Describes the soil physical characteristics. Most values adjusted during calibration are in this file.
- Sample:

  #RUN GRID LAT LNG INFILT Ds Ds_MAX Ws C EXPT_1 EXPT_2 EXPT_3 Ksat_1 Ksat_2 Ksat_3 PHI_1 PHI_2 PHI_3 MOIST_1 MOIST_2 MOIST_3 ELEV DEPTH_1 DEPTH_2 DEPTH_3 AVG_T DP BUBLE1 BUBLE2 BUBLE3 QUARZ1 QUARZ2 QUARZ3 BULKDN1 BULKDN2 BULKDN3 PARTDN1 PARTDN2 PARTDN3 OFF_GMT WcrFT1 WcrFT2 WcrFT3 WpFT1 WpFT2 WpFT3 Z0_SOIL Z0_SNOW PRCP RESM1 RESM2 RESM3 FS_ACTV JULY_TAVG
  1 1 39.6875 -86.3125 0.10 0.05 10.00 0.80 2 13.660 13.660 13.660 184.90 184.90 184.90 -999 -999 -999 25.4 101.7 127.1 231.10 0.100 0.400 0.500 11.29 4 6 6 6 0.220 0.220 0.220 1406.94 1406.94 1406.94 2685 2685 2685 -6 0.534 0.534 0.534 0.372 0.372 0.372 0.030 0.001 923.20 0.0 0.0 0.0 0
- Explanation:
  - Line 1 is an optional header.
  - Line 2 defines parameters for each VIC model simulation cell (repeats for each grid cell):

  File Position   Variable Name
  Column 1        <run flag>
  Column 2        <cell number>
  Column 3        <latitude>
  Column 4        <longitude>
  Column 5        <bi>
  Column 6        <ds>
  Column 7        <dsmax>
  Column 8        <ws>
  Column 9        <c>
  Columns 10-12   <expt1> <expt2> <expt3>
  Columns 13-15   <ksat1> <ksat2> <ksat3>
  Columns 16-18   <not used> <not used> <not used>
  Columns 19-21   <moist1> <moist2> <moist3>
  Column 22       <elevation>
  Columns 23-25   <depth1> <depth2> <depth3>
  Column 26       <soilT>
  Column 27       <dp>
  Columns 28-30   <bub1> <bub2> <bub3>
  Columns 31-33   <qtz1> <qtz2> <qtz3>
  Columns 34-36   <bd1> <bd2> <bd3>
  Columns 37-39   <pd> <pd> <pd>
  Column 40       <off GMT>
  Columns 41-43   <cp1> <cp2> <cp3>
  Columns 44-46   <wp1> <wp2> <wp3>
  Column 47       <rough>
  Column 48       <snow_rough>
  Column 49       <prec>
  Columns 50-52   <rm1> <rm2> <rm3>
  Column 53       <flag>
- Whenever a value is repeated 3 times, there is one value for each soil layer (most VIC model applications use a 3-layer model).
- Of initial interest:
  - Run flag: 1 to run the current cell; 0 to turn the cell off
  - Baseflow parameters: ds, dsmax, ws
  - Runoff parameter: bi
  - Soil layer thicknesses (\(m\)): depth1, depth2, depth3
  - Initial soil moisture (\(mm\)): moist1, moist2, moist3
  - Bulk density (\(kg/m^3\)): bd1, bd2, bd3
  - Vertical conductivity (\(mm/day\)): ksat1, ksat2, ksat3
- Link to the official documentation page.
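Because the calibration parameters sit at fixed column positions, a quick awk one-liner can pull them out for inspection. The single-line file below is a truncated stand-in for the real soil.parameter.txt (only the first nine columns of the sample line above):

```shell
# First nine columns of the sample soil parameter line shown above.
cat > soil.sample.txt <<'EOF'
1 1 39.6875 -86.3125 0.10 0.05 10.00 0.80 2
EOF

# Columns 5-8 hold bi, ds, dsmax and ws (see the column table above).
awk '{printf "bi=%s ds=%s dsmax=%s ws=%s\n", $5, $6, $7, $8}' soil.sample.txt
# prints: bi=0.10 ds=0.05 dsmax=10.00 ws=0.80
```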
Daily weather data file (data_40.4375_-86.9375)
File with variables in columns, one timestep per line. The file can also be supplied in binary format.
- The following variables are needed at a minimum, at a daily time step; they can be in any column:
  - precipitation (\(mm\))
  - maximum daily air temperature (\(^\circ C\))
  - minimum daily air temperature (\(^\circ C\))
  - daily average wind speed (\(m/s\))
- File name specified by: data_<latitude>_<longitude>
  - The prefix 'data' can be changed in the global control file.
  - Latitude/longitude contain 4 decimal places, as specified in the global control file.
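Since the forcing file is plain columns, simple tools can sanity-check it before a run. The sketch below assumes a column order of precipitation, tmax, tmin, wind (the actual order is whatever the global control file specifies) and uses three made-up days of data:

```shell
# Three synthetic daily records: precip(mm) tmax(C) tmin(C) wind(m/s).
cat > data_40.4375_-86.9375 <<'EOF'
0.0 25.1 12.3 2.1
5.2 22.0 14.8 3.4
1.1 24.6 13.0 1.9
EOF

# Total precipitation and mean daily maximum temperature over the record:
awk '{p += $1; t += $2} END {printf "precip_total=%.1f mm tmax_mean=%.1f C\n", p, t/NR}' data_40.4375_-86.9375
# prints: precip_total=6.3 mm tmax_mean=23.9 C
```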
Step 4: Setting up your work space
- Open a second Linux terminal (working with the second terminal will reduce the number of times you have to change directories).
- Within the second terminal, return to your home directory, e.g. "cd".
- Then use the "cd" command to change into the scratch drive for the current cluster system: "cd $CLUSTER_SCRATCH". The environment variables CLUSTER_SCRATCH and RCAC_SCRATCH should be set correctly when you log onto any of the RCAC-maintained cluster systems. This is your scratch storage space; it comes with no quota and is designed for the fastest access possible by the cluster computers. Files stored on this disk will be removed after about 2 weeks of no use, and they are NOT backed up. As the name implies, this is "scratch" space, not long-term storage.
- While in the scratch drive, create a new directory called "MyVicTutorial", which is where you will conduct your simulations and analysis.
- Change into the new sub-directory, which is empty. Now we will make links back to the source code and setup directories, using the following commands:

  # These commands assume that the folder VICtutorial is in your home folder (~/).
  # Change the folder path if you put VICtutorial somewhere else.
  {prompt} ln -s ~/VICtutorial/VIC_4.1.2.c
  {prompt} ln -s ~/VICtutorial/ACRESample/inputs
  {prompt} ln -s ~/VICtutorial/ACRESample/plots
- Use "ls -l" to display a long listing of the current directory. If color highlighting is turned on in your terminal, a successful link's local name will be blue and its source path a solid dark blue. If the link's local name is red and the source path is flashing red, then the folder or file you linked to does not exist.
- Now we can access the original directories located on your home drive (which is backed up regularly, but is slower and has more limited storage space), while the contents of the directories are usable from the more efficient scratch space. Note that linked files will appear with a "@" at the end of the file name in a short listing. If you do a long listing, linked files will have an "l" in the first column of the permission settings, and the local name will be listed first, followed by "->" pointing to the real file / directory name.

  {prompt} ls -l
  total 0
  lrwxrwxrwx 1 cherkaue student 44 Apr 8 14:47 inputs -> /home/cherkaue/VICtutorial/ACRESample/inputs/
  lrwxrwxrwx 1 cherkaue student 43 Apr 8 14:47 plots -> /home/cherkaue/VICtutorial/ACRESample/plots/
  lrwxrwxrwx 1 cherkaue student 38 Apr 8 14:47 VIC_4.1.2.c -> /home/cherkaue/VICtutorial/VIC_4.1.2.c/
  {prompt} ls inputs
  inputs@
  {prompt} ls inputs/
  data_40.4375_-86.9375  lakes.412.txt       vegetation.parameter.412.txt
  global_4.1.2.txt       soil.parameter.txt  veglib_lumped.txt
  {prompt}
- You can see the contents of the linked sub-directory if you include the "/" at the end of the local name; otherwise you will see the local linked file placeholder (or shortcut).
- You can change into the linked sub-directory VIC_4.1.2.c using the cd command, but if you try to return to the working directory using "cd .." you will end up in the ~/VICtutorial directory, since you followed the link to the actual location of the sub-directory.
- You can return to your work directory at any time using "cd /scratch/scholar/cherkaue/MyVicTutorial".
- I suggest that you open two terminals so that you can work in the two required locations without constantly changing directories.
  - Click on the first terminal window and change into the directory "~/VICtutorial"; this terminal will stay on your home drive.
  - Click on the second terminal window and change into the simulation directory on your scratch drive. This terminal will stay on the scratch drive.
  - These two terminals will help you keep track of where you are, and of what is on the scratch drive (unprotected/not backed up) versus the home drive (protected/regular backups).
Step 5: Running the model
- In the terminal pointed at your home drive, change into the directory "~/VICtutorial/ACRESample/inputs" and use emacs to open the global control file (global_4.1.2.txt).
- Search through the global file to find the following control variables:
  - SOIL,
  - VEGPARAM,
  - VEGLIB,
  - FORCING1,
  - RESULT_DIR, and
  - IMPLICIT
- The entries following the control variables SOIL, VEGPARAM, VEGLIB and FORCING1 in the global file should be the correct path and filename of the provided input files.
  To check that the files are correct:
  - Highlight the path and filename for each keyword in emacs by holding the left mouse button while dragging over the entry in the global control file.
  - At the command prompt in the scratch drive terminal type "ls ", then click the middle mouse button to paste.
  - If you do not have a three-button mouse, use <ESC>-<w> in emacs to copy the highlighted text, then paste it into the terminal.
  - Press "Enter"; if the listing in the global file is correct, it should echo the filename, for example:

    {prompt} ls inputs/soil.parameter.txt
    inputs/soil.parameter.txt
  If the file does not exist, you will get an error message. Confirm that you are in a directory with inputs as a subfolder, and make sure that you spelled the filename correctly (case counts).
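This check can also be scripted instead of copy-pasting each entry by hand. The sketch below builds a tiny stand-in global file and input files so it is self-contained; on the cluster you would point it at the real global_4.1.2.txt and loop over all four keywords:

```shell
# Stand-in global file with two of the keywords (contents made up for this sketch).
cat > global.sample.txt <<'EOF'
SOIL      inputs/soil.parameter.txt
VEGLIB    inputs/veglib_lumped.txt
EOF
mkdir -p inputs
touch inputs/soil.parameter.txt inputs/veglib_lumped.txt

# Pull the filename that follows each keyword and test that the file exists.
for key in SOIL VEGLIB; do
    f=$(awk -v k="$key" '$1 == k {print $2}' global.sample.txt)
    if [ -f "$f" ]; then echo "$key OK: $f"; else echo "$key MISSING: $f"; fi
done
# prints: SOIL OK: inputs/soil.parameter.txt
#         VEGLIB OK: inputs/veglib_lumped.txt
```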
- The keyword RESULT_DIR should be set to the output directory ("outputs") that you will create on your scratch drive later in this step.
- Finally, change the setting of IMPLICIT in the global control file from FALSE to TRUE.
- When you have completed these steps, save the global control file.
- In the terminal pointed at your scratch drive, change into the directory "/scratch/scholar/cherkaue/MyVicTutorial" and create a new directory called "outputs".
- Now you can run the model as follows:

  {prompt} VIC_4.1.2.c/vicNl -g inputs/global_4.1.2.txt
- The model will scroll a lot of information to the screen while it runs, but if the model finishes correctly you will see some version of the following statements (note that the Water Error should be 0.0000, but there can be variation in the other values):

  Total Cumulative Water Error for Grid Cell = 0.0000
  Total Cumulative Energy Error for Grid Cell = 3.1413
  Total number of fallbacks in Tfoliage: 0
  Total number of fallbacks in Tcanopy: 0
  Total number of fallbacks in Tsnowsurf: 0
  Total number of fallbacks in Tsurf: 0
  Total number of fallbacks in soil T profile: 2508563
- The Total Cumulative Water Error should typically be 0.00 mm; make a note if it is not. The energy error should be on the order of 0 to 5 \(W/m^2\).
- You can ignore any messages about the implicit scheme failing.
Step 6: Evaluating model output
- The number and contents of VIC model output files are controlled by the last part of the global control file. Open the global file and scroll to the bottom, or search for the keyword N_OUTFILES, to get to the appropriate section.
- For this tutorial, the VIC model is set up to produce 4 output files with the following characteristics:
  - Each grid cell will have one (or more) unique output files, e.g. 'fluxes_<latitude>_<longitude>', where <latitude> and <longitude> match the coordinates for each simulation cell defined in the soil parameter file.
  - Each row is a time step; each column is a different variable, in the order specified in the global control file.
  - For the fluxes output file there are four output variables: OUT_EVAP, OUT_RUNOFF, OUT_BASEFLOW and OUT_SOIL_MOIST.
    - The first 3-4 columns are <year>, <month>, <day> and <hour> (<hour> is only reported if output is sub-daily; output from the tutorial is daily, so there will be no <hour> column).
    - The next column is OUT_EVAP, which contains the total evapotranspiration for the simulation cell (mm) for each time step.
    - The next column is OUT_RUNOFF, which contains surface runoff / overland flow for the simulation cell (mm) for each time step.
    - The next column is OUT_BASEFLOW, which contains subsurface return flow (baseflow) for the simulation cell (mm) for each time step.
    - The last three columns include OUT_SOIL_MOIST for each of the three VIC model soil layers (surface, middle, bottom). This is soil moisture in mm at the end of each time step.
  - Output files have also been created for snow, soil and lake variables.
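Given that column layout, daily streamflow (runoff + baseflow) can be pulled straight out of a fluxes file with awk. The two-day file below is synthetic, matching the daily layout described above (year, month, day, then evap, runoff, baseflow and three soil moisture layers):

```shell
# Synthetic two-day daily fluxes file (values are made up for this sketch).
cat > fluxes_40.4375_-86.9375 <<'EOF'
1990 01 01 0.8 1.2 0.4 30.1 95.2 120.0
1990 01 02 0.6 0.0 0.5 29.8 95.0 119.8
EOF

# Runoff is column 5 and baseflow is column 6; their sum is streamflow.
awk '{printf "%s-%s-%s streamflow=%.1f mm\n", $1, $2, $3, $5 + $6}' fluxes_40.4375_-86.9375
# prints: 1990-01-01 streamflow=1.6 mm
#         1990-01-02 streamflow=0.5 mm
```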
- The global control file can control some additional options, including:
  - Values are written to each file at the interval defined by OUT_STEP (24 = daily for the tutorial).
  - Files can be ASCII or BINARY, as defined by BINARY_OUTPUT (set to FALSE, which is ASCII output, for the tutorial).
  - A file header can be included using the variable PRT_HEADER (set to FALSE in the tutorial).
- Use processing programs/scripts and GMT to create summary graphics. A sample GMT/c-shell script has been provided to create graphs of runoff, baseflow, soil temperature and SWE versus time for the period 1990-1996. The sample script plot.timeseries.sh in the "plots" sub-directory can be run from the same directory as the model using this command:

  {prompt} plots/plot.timeseries.sh

  This will create a new postscript output file in the plots subdirectory.
- On the RCAC cluster computer systems, GMT is not loaded automatically; use the "module load gmt" command to import the default version of GMT. The command "module spider gmt" will show you if there are other versions available.
- Output from the GMT script can be viewed using either the gs or ghostview (gv) command:

  {prompt} gs plots/BaseScenario.ps
  {prompt} ghostview plots/BaseScenario.ps &

  Both use the ghostscript program to interpret postscript (and PDF) formatted files. The "gs" program is a simple page viewer that will open a letter-page-sized window and draw a rough draft of the contents. Hitting "Enter" at the gs prompt will move to the next page, when there is a next page available. Typing "quit" at the gs prompt will exit the program. The "ghostview" (gv) program is more user-interactive (thus it is typically run in the background); it provides a better rendering of the file contents and the ability to zoom and scroll through the file. It also has more of a GUI, so options are easier to find.
Step 7: Running simulations correctly on a clustered computer system
Making use of the Linux cluster systems requires a slightly different mindset from using standard Linux systems. The biggest change is that in most cases you do not actually log onto the compute nodes of the cluster itself; instead you log onto the shared front-end systems. These systems give you access to the files and resources of the computer cluster, but their role is as a staging location for users to prepare their jobs for submission to the larger cluster. The following steps will introduce the concept further, and walk you through the process of developing a shell script to submit your VIC model job to the cluster.
- When you log into one of the cluster systems (e.g., ssh -Y scholar.rcac.purdue.edu), you will be directed to one of the front-end systems. If you have followed the instructions to set your shell to C-shell and install the common .cshrc, then you should have a prompt that looks something like scholar-fe## or conte-fe##, where ## is a two-digit number (00, 01, 02, ...). This number indicates the front-end system to which you have logged in.
  - Each time you log in to a cluster computer system using the generic name (e.g., scholar.rcac.purdue.edu or brown.rcac.purdue.edu), you will be directed to the front-end system with the fewest users / least demand. So even if you log in within minutes of a previous session, your sessions may be on separate systems. Each front-end system is a stand-alone Linux system and will function just like pasture, danpatch, bridge, ob, ganges or any other Linux computer.
  - If you want to log in to a specific front-end system, for example system number 02, then include the number in the address (e.g., ssh -Y scholar-fe02.rcac.purdue.edu). This will override the system's attempt to distribute usage and log you directly into the requested machine. In most cases this is unnecessary; however, if you start a job on a front-end system (not recommended, but there are some reasons for doing this) and need to check its status, you will have to return to the same front-end system to find its process ID (PID).
- RCAC monitors usage of the front-end computers and will automatically kill jobs that exceed limits on CPU and memory usage. If your job is killed, you will get an email to your Purdue account with an explanation. The front-end systems are for user interactions; the real power of the system is in the cluster compute nodes, which require another step to access. By running large jobs on the front-end systems, you impair the ability of other users to stage their own data, and you are not using the resource to its full ability.
- There are two main options for using the cluster nodes:
  - Submit a script to the SLURM queue system and wait for it to finish, or
  - Request an interactive session through the SLURM queue and interact directly with the compute nodes.
- These methods require that you have access to a queue on an RCAC cluster system. Examples of queues available on scholar include:
  - The scholar queue, which is the default queue;
  - The long queue, for longer-running jobs (more limited number of nodes available);
  - The gpu queue, for making use of the cluster node GPUs; and
  - The debug queue, for short, rapid-turn-around access to nodes for debugging.
- The queues available to you may differ based on the cluster computer on which you are running. In particular, you may have access to queues specific to your research group, if it has purchased or rented access to the cluster systems. You can always list the queues available to you using the 'slist' command at the terminal prompt.
- This tutorial assumes you have access to the scholar cluster; if you are instead using a research queue, replace the 'scholar' queue with the correct queue name.
- To run the VIC model on the scholar queue, you first have to write a submission script. This will take the command typed at the prompt in Step 5 and wrap it in a shell script so that it will run without requiring command line options.
- Click on the terminal window pointing at your scratch drive folder and use emacs to open a new file called "RunVicModel.sh".
- Type or copy the following into the blank emacs window:

  #!/bin/env sh
  cd $SLURM_SUBMIT_DIR
  VIC_4.1.2.c/vicNl -g inputs/global_4.1.2.txt
  module load gmt
  plots/plot.timeseries.sh
  The first line tells the submission system to interpret the file as a Bourne shell script, and the second line changes the working directory to the one stored in the environment variable SLURM_SUBMIT_DIR. This is set to the directory from which you submit your job; when the submitted job starts on the cluster node, it will start in your home directory (just like when you first log in to a system). The next line is the same command you used to run the VIC model in Step 5, and the final two lines are the commands used in Step 6 to generate the time series figure.
- Now save the file from emacs.
- Change the file permissions so that the new script file is executable.
- Now the script file can be submitted to the SLURM queue system using the "sbatch" command:

  {prompt} sbatch -A scholar -t 15 RunVicModel.sh
  The "-A" flag option selects the queue; in this case the simulations are run from the class account on the scholar cluster, so we used the 'scholar' queue (for other options see the list of queues above). The "-t" flag sets the "walltime" to 15 minutes (accepted formats include MM, HH:MM:SS and Days-HH:MM:SS). The walltime tells the queue approximately how long your job will need on the remote compute node. The queue shuffles the start times of jobs to maximize usage of resources; it takes into account your priority (research user queues get priority, then normal queues such as scholar on scholar, and finally the standby and debug queues), requested resources (one or many compute nodes, time requirement, disk requirements), and the order of requests in the queue when scheduling resources. Once your job is running, the walltime is a hard maximum, so if your simulation runs for more than 15 minutes it will be terminated even if it is not done. Selecting an appropriate walltime is therefore important: too long and the queue will have to clear a lot of resource time, so your start may be delayed; too short and your job will not finish. In general, it is better to request too much time, but pay attention to how long your jobs run so you can do better in the future.
- You can check the status of your submitted job from any front-end machine on the cluster by using the "squeue" command:

  {prompt} squeue -u cherkaue
  scholar.rcac.purdue.edu:
                                                                Req'd  Req'd     Elap
  Job ID                 Username Queue   Jobname          SessID NDS TSK Memory Time      S Time
  ---------------------- -------- ------- ---------------- ------ --- --- ------ --------- - -----
  8831406.carter-adm.rca cherkaue scholar RunVicModel.sh       --  --   1     -- 00:15:00  Q    --
  The "-u" flag option allows you to filter by username rather than seeing the complete list (though seeing the complete list does give you a sense of how busy the queue is today). The response provides the Job ID, username of the submitter, queue that was selected, name of the job (in this case the run script), the requested walltime, and the job's status - in this case "Q" for queued. The other information, currently represented by "--", will be populated once the job actually starts running on a remote node.
- If you want to kill this or any other job you own on the queue, you can use the Job ID with the "scancel" command.
- When the job is done, it will no longer appear in the SLURM queue, so the "squeue" command will return nothing.
- There will, however, be a new file in the directory from which you submitted your job: slurm-######.out, where "######" will be the same as the Job ID of your submission. This file contains a log of messages written to stdout and stderr while the submitted job was running. It should contain most of what was written to the screen the first time you ran the VIC model at the command prompt in Step 5, so it should have the messages about total energy and water balance errors near the end, and these values should be the same as what you got when you first ran the model.
- The files in the outputs directory, as well as the postscript file in the plots directory, will also have newer time stamps, reflecting that they were generated by the submitted simulation, not the original simulation from Step 5.
- View the updated plot of output variables.
- When you have confirmed that the simulation ran correctly, you can delete the log files using "rm slurm*.out". This will remove all log files in the directory that start with "slurm-" and end in ".out".
- If you want to preserve the outputs from the VIC model simulation, not just the plot that you created, then the output files in the local "outputs" directory must be copied back to your home drive. Another good option, especially when the raw outputs from the model are large, is to use the "tar" command to bundle all output files into a single archive file and copy the tar file to the Fortress tape archive system. Information on using Fortress can be found at https://www.rcac.purdue.edu/, while instructions specific to using the PHIG group storage space on Fortress can be found at Archiving Data on Fortress.
Step 8: Simple tests of the VIC model
Below are a set of "assignments" that you should complete to explore some of the various modes of operation of the VIC model.
Calibration
These are some of the most basic changes a user would make to the model parameters when starting to compare with observations.
- Change the maximum groundwater flow rate:
  - Open the soil parameter file and change dsmax from 100.00 to 10.00.
  - Create a new output sub-directory (e.g. "lowdsmax"), change the name of the output directory in the global control file, and re-run the VIC model.
  - Create graphs of runoff, baseflow and bottom-layer soil moisture versus time for 1 to 2 years, including both the original and modified values on each graph.
- Change the groundwater flow rate during low moisture conditions:
  - Open the soil parameter file, change dsmax back to its original value of 100.00, and change ds from .001 to .1 and ws from .9 to .75.
  - Create a new output sub-directory, change the name of the output directory in the global control file, and re-run the VIC model.
  - Create graphs of runoff, baseflow and bottom-layer soil moisture versus time for 1 to 2 years, including both the original and modified values on each graph.
- Change the infiltration rate:
  - Open the soil parameter file, change bi from 0.2 to 0.01, and re-run the VIC model.
  - Create graphs of runoff, baseflow and top-layer soil moisture versus time for 1 to 2 years, including both the original and modified values on each graph.
- Summarize the results: how did runoff and baseflow change in each scenario, and why?
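One way to line the scenarios up for that summary is to paste the original and modified fluxes files side by side and difference a column. Both files below are one-line synthetic stand-ins; with real output you would point paste at the two result directories:

```shell
# Synthetic single-day fluxes lines for an original and a modified run
# (year month day evap runoff baseflow sm1 sm2 sm3; values made up).
mkdir -p outputs lowdsmax
printf '1990 01 01 0.8 1.2 0.40 30 95 120\n' > outputs/fluxes_40.4375_-86.9375
printf '1990 01 01 0.8 1.4 0.15 30 95 118\n' > lowdsmax/fluxes_40.4375_-86.9375

# paste joins matching lines; baseflow is field 6 (original) and field 15 (modified).
paste outputs/fluxes_40.4375_-86.9375 lowdsmax/fluxes_40.4375_-86.9375 |
    awk '{printf "%s-%s-%s baseflow: orig=%.2f mod=%.2f diff=%.2f\n", $1, $2, $3, $6, $15, $15 - $6}'
# prints: 1990-01-01 baseflow: orig=0.40 mod=0.15 diff=-0.25
```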
Evaluate the Effect of Frozen Soil
- Turn the frozen soil option off:
  - Open the global control file and set FROZEN_SOIL to FALSE.
  - Create a graph of soil moisture in the top layer, runoff and baseflow versus time, including both the original and modified values on the graph.
  - Create a separate graph of average soil temperature for the top layer (OUT_SOIL_TEMP) versus time for the original run.
- How and when did representing soil frost impact infiltration and runoff?
Evaluate the Effect of Lakes
- Turn the "lake" option on, which allows the computation of the water balance for lakes and wetlands:
  - Open the global control file, uncomment the lake settings by changing "#LAKES ./inputs/lakes.412.txt" to "LAKES ./inputs/lakes.412.txt" and "#LAKE_PROFILES TRUE" to "LAKE_PROFILES TRUE", and re-run.
  - Create graphs of streamflow (runoff + baseflow) and total soil moisture (sum of layers 1, 2 & 3) versus time, including both the original and modified values on each graph.
  - Create a separate graph of lake depth versus time.
- How does the presence of lakes impact the streamflow (runoff + baseflow)?