Basic Freva configuration#
Because the framework consists of multiple parts, Freva has to be configured
to work with those different parts. This is especially true for the web ui
and the evaluation_system core running on an HPC system. The central
building block of the Freva configuration is the evaluation_system.conf
configuration file. This file holds information on the server infrastructure,
that is the host names of the apache solr and mariadb servers, as well as
project specific information like the project name and the directory structure.
This part of the documentation focuses on the directory structure, which
is important for the interplay of the evaluation_system core and the web ui.
The content of the initial evaluation_system.conf file looks like the
following:
[evaluation_system]
# Freva - Free Evaluation System Framework configuration file
# This file contains the necessary configuration information
# which is needed to run Freva
#
# Username(s) - comma separated - of the admins of the project
admins=
#
# An informative project name
project_name=freva
# The url the Freva web ui can be accessed with
project_website=
# Main configuration path of the Freva instance, defaults to the etc dir
# of the python environment
root_dir=
#
# The location of the work directory for user specific data
base_dir_location=${root_dir}/freva_output
#
#: Type of directory structure that will be used to maintain state:
#:
#:    local   := <home>/<base_dir>...
#:    central := <base_dir_location>/<username>/<base_dir>...
#:
directory_structure_type=central
# The directory name of the <base_dir> (only used if `directory_structure_type`
# is set to central) - defaults to `project_name`
base_dir=${project_name}
# Workload manager configuration
# Workload manager system - currently the following workload manager systems are
# available: local (no workload manager), lsf, moab, oar, pbs, sge, slurm
scheduler_system=local
# The directory where temporary job scripts are created
scheduler_input_dir=/tmp/${scheduler_system}_output
# The output directory where stdout/stderr of the jobs are stored
scheduler_output_dir=${base_dir_location}/share/${scheduler_system}_output
# The directory where preview data (images etc) for the web ui
# is stored.
preview_path=${base_dir_location}/share/preview
# The number of processes used to convert images created by plugins
# for display in the web ui.
number_of_processes=6
#: Path to the directory where users can add and share project specific data
project_data=${base_dir_location}/data/crawl_my_data
#: database path
#: mySQL settings
db.host=
#: Define access to the solr instance
solr.host=
solr.port=8989
solr.core=files
#shellinabox
#shellmachine=None
#shellport=4200
# Workload manager job configuration
# NOTE: The options are workload manager agnostic - except of the
# extra_options flag: which must be in accordance to the utilized
# workload manager.
[scheduler_options]
#partition=
#memory=256GiB
#queue=gpu
#project=ch1187
#walltime=08:00:00
#cpus=12
##Additional options (specific to the workload manager)
#extra_options=--qos=test, --array=20
#[plugin:animator]
#python_path=${evaluation_system:root_dir}/plugins/animator
#module=animator
The configuration file is written in .ini format.
The main sections are evaluation_system, which configures the behaviour of the
core, and scheduler_options, which configures the workload manager options.
Plugins are configured via plugin:<plugin_name> sections. Values can reference
other configuration keys via the ${key} and ${section:key} notation used
throughout the file above.
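For instance, enabling the commented-out animator entry from the file above would result in a plugin section like the following; the plugin path and module name are placeholders and depend on where the plugin is actually installed:
[plugin:animator]
# Path to the plugin code and the python module that implements it
python_path=${evaluation_system:root_dir}/plugins/animator
module=animator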
Configuring the evaluation_system section#
The following gives an overview of the most important keys to be configured
in the evaluation_system section:
admins: Comma separated list of username(s) that administer this instance of Freva. Admins have additional rights in the framework, like crawling all user data.
project_name: Name of this specific Freva instance, for example clex-ces, which would be the central evaluation system (ces) for the clex project.
root_dir: The directory that holds additional Freva related configuration. By default this folder is the path prefix of the python distribution Freva is installed into. If you're using multiple instances that share a common python (anaconda) environment, this directory distinguishes the instances. For example, if two projects project_a and project_b share a Freva installation located in $HOME/anaconda/freva, then project_b gets its configuration by setting root_dir to /path/to/project_b, so that its evaluation_system.conf file is located in /path/to/project_b/freva/evaluation_system.conf; the same applies to project_a.
directory_structure_type: Specifies where the user output is stored. Two different types are possible: local stores the plugin output in the home directory of the user, while central stores the plugin output in a central location with each username as the first child directory, for example /path/to/project/work/<username> (see the sketch after this list).
base_dir_location: The path where the user output of the plugins is stored. This key only has an effect if directory_structure_type is set to central.
scheduler_system: Plugin jobs can either run interactively or in batch mode; for batch mode a number of different workload manager systems are available. The following workload managers are implemented: lsf, moab, oar, pbs, sge, slurm, local. If local is chosen, the plugin job is executed directly on the login node, similar to an interactive job. This means that load balancing is not available for local jobs.
preview_path: The path to the directory holding the user content, like plots, for display in the web ui.
💡 After a plugin has been applied, display content of the plugin output is copied to this directory. The default location of this directory is <base_dir_location>/share/preview.
db.host: The host name where the database and the vault are running.
solr.host, solr.core, solr.port: The apache solr connection information.
💡 The solr.core variable might be subject to removal in the future. Therefore it is recommended to set this variable to its default value: files.
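To illustrate how these keys interact, the following sketch shows a possible central setup; the project name and paths are hypothetical and only serve to show the resulting directory layout:
[evaluation_system]
# Hypothetical values for illustration only
project_name=project_b
root_dir=/path/to/project_b
base_dir_location=${root_dir}/freva_output
directory_structure_type=central
base_dir=${project_name}
# With directory_structure_type=central the plugin output of <username> is
# stored under <base_dir_location>/<username>/<base_dir>, here
# /path/to/project_b/freva_output/<username>/project_b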
Configuring the scheduler_options section#
All workload managers are configured via the scheduler_options section. The
following configuration keys are available:
queue: Destination queue for each worker job.
project: Accounting string associated with each worker job.
walltime: Walltime for each worker job.
cpus: Number of cpus to book on each job node.
memory: Amount of memory to request for each worker job.
extra_options: Additional workload manager specific options (comma separated).
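As a sketch, a scheduler_options section for a slurm setup might look like the following; the partition, project and resource values are placeholders that have to be adapted to the target HPC system:
[scheduler_options]
# Hypothetical values for illustration only
partition=compute
project=ch1187
walltime=08:00:00
cpus=12
memory=256GiB
# Options specific to the chosen workload manager (here slurm)
extra_options=--qos=test, --array=20
These options are only relevant when scheduler_system in the evaluation_system section is set to a batch system such as slurm.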