rdd – tool to launch script functions.

Synopsis

rdd DATA [options]
rdd DATA OBJECTS [options]
rdd DATA OBJECTS TASKS [options]
rdd [options]
rdd --help|--version

Description

rdd launches functions from selected libraries written in interpreted languages. It also gathers data from configuration files in various combinations.
The tool lets you build scripts whose input data and control flow are set from the command line.

rdd-howto(7) contains a step-by-step usage description with examples. This manual describes all implementation details exhaustively and is more useful once you are familiar with the essentials.

rdd provides a command line interface to say:
read the DATA, take some OBJECTS and launch these TASKS for these objects with the given input data.

DATA
is assembled from command line options and plain text configuration files. Data may include other data, so a large data set can be loaded by naming a single start entry.
OBJECTS
is a list of names. Each name can store its own personal data or properties in addition to the generic set. The resulting code is launched as a separate process for every object in the list.
TASKS
is a list of functions from the user's libraries to launch. Supported programming languages are Bourne shell and python2. Arbitrary library functions can be called directly from the command line.

Short-form arguments

The first arguments (from one to three) can be passed in short form, as a value without the corresponding option name. This makes it possible to type essential commands as a sentence, without symbols like - or =.

Positional values map to options as follows:

DATA
first short argument, the value of rdd_prf_entry option.
OBJECTS
second short argument, the value of rdd_list_entry option.
TASKS
third short argument, the value of rdd_map_entry option.

The illustration of short forms and their equivalents:

rdd DATA
rdd rdd_prf_entry=DATA

rdd DATA OBJECTS
rdd rdd_prf_entry=DATA rdd_list_entry=OBJECTS

rdd DATA OBJECTS TASKS
rdd rdd_prf_entry=DATA rdd_list_entry=OBJECTS rdd_map_entry=TASKS

When several values are given for the same option, the rightmost one wins, regardless of whether it is in short form or not.

These three options must be initialised somehow for rdd to launch successfully. Anything missing from the command line must be present in configuration files and vice versa.
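
The mapping of positional values to options can be sketched in Python. This is a standalone illustration, not code from rdd: the function name expand_arguments is hypothetical, it assumes that any argument without = is a positional value, and it does not handle --help or --version.

```python
# Options filled by the first, second and third positional value.
SHORT_FORM_OPTIONS = ("rdd_prf_entry", "rdd_list_entry", "rdd_map_entry")


def expand_arguments(argv):
    """Turn a list of arguments into an option dictionary.

    Assumption: an argument containing "=" is name=value, anything else
    is a positional value. Later assignments overwrite earlier ones, so
    the rightmost value of an option wins whatever its form.
    """
    options = {}
    position = 0
    for arg in argv:
        if "=" in arg:
            name, value = arg.split("=", 1)
            options[name] = value
        elif position < len(SHORT_FORM_OPTIONS):
            options[SHORT_FORM_OPTIONS[position]] = arg
            position += 1
        else:
            raise ValueError("unexpected positional argument: %r" % arg)
    return options
```

For example, expand_arguments(["DATA", "rdd_prf_entry=OTHER"]) leaves rdd_prf_entry set to OTHER, because the explicit form stands to the right of the short one.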

Common option syntax

Many options receive a single entry or a list of entries as the value. Entries are separated with commas.
Pure text entries in arguments may contain ASCII letters, digits and the underscore symbol _.
Paths may contain ASCII letters, digits, spaces and the following symbols:
        ~@.+-/_
Object names may contain ASCII letters, digits and the following symbols:
        ~@.+-_
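
The three character sets above can be written as regular expressions. The following Python sketch is only an illustration of the rules; the constant and function names are hypothetical and are not part of rdd.

```python
import re

# Character sets taken from the rules above.
TEXT_ENTRY = re.compile(r"[A-Za-z0-9_]+")
PATH = re.compile(r"[A-Za-z0-9 ~@.+\-/_]+")
OBJECT_NAME = re.compile(r"[A-Za-z0-9~@.+\-_]+")


def is_valid(pattern, value):
    """Return True when the whole value consists of allowed characters."""
    return pattern.fullmatch(value) is not None
```

With these patterns, host-1.example is a valid object name, while a/b is not, because the slash is allowed only in paths.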

Files and directories given with non-absolute paths are prepended with the root of the current working copy, stored in the RDD_ROOT variable.

Some entries may be preceded by specifier words (see rdd_map_libs or the func entry syntax in rdd_map_entry). Specifiers can be separated from entries with spaces or with the :: symbol combination.
Both forms are equivalent. The first form is more readable, but can be used only inside configuration files. The second form can also be used on the command line, since it contains no spaces.

There can be any number of spaces between list entries, as well as between an option name and its value inside configuration files.

Data gathering

The first part of rdd's work is to gather data in the right order from all configuration files and options. This part can be done separately with datardd(1).

A configuration file has the following format:

[section]

# comment on any line
option = value
option_with_list = list_value, another_list_value
...

When the same variable has different values, the value is taken from the source with the highest priority. When data is read from sources with the same priority, sections read earlier take priority over those read later.

The first sections are taken from rdd_prf_id on the command line or in the rdd.def file. The next priority belongs to sections from rdd_prf_entry on the command line or in the rdd.def file. Further sections are added and parsed whenever rdd_prf_id is encountered inside data sections already read from configuration files. Several generated values are added to the resulting data set, as described in Generated options.

Sources of data, in priority order:

- command line options
- file rdd.def in the root of the working directory
- file rdd_atom_path/rdd_atom_id/atom.conf
- files *.conf inside the rdd.conf.d/ directory in the root of the working directory
- only if there is no rdd.conf.d/ in the working directory and RCR_ROOT is set to a valid directory: files *.conf inside the $RCR_ROOT/../etc/rdd and $RCR_ROOT/../../etc/rdd directories
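
The effect of source priorities can be sketched as a merge where the first value found wins. The Python below is an illustration only: merge_sources is a hypothetical name, each source is modelled as a plain dictionary, and the list is assumed to run from highest to lowest priority.

```python
def merge_sources(sources):
    """Merge option dictionaries ordered from highest to lowest priority.

    A value already present is never overwritten, so the source with
    the highest priority decides the final value of each option.
    """
    merged = {}
    for source in sources:
        for name, value in source.items():
            merged.setdefault(name, value)
    return merged
```

Called with the command line options first and the *.conf data last, the function keeps the command line value whenever the same option appears in both.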

Data dump

The second part of rdd's work is handling dumps. A dump is a script in the programming language in use, containing the gathered data and library files. A dump is created for each language in use. This part can be done separately with dumprdd(1) (prints the dump to stdout) and droprdd(1) (stores the dump in a file).

For each rdd_atom_id an object descriptor is created as follows:

rdd_atom_id_word1_word2_...wordN

word1 to wordN are the content of rdd_prf_entry, given on the command line. It is the shortest key that identifies the resulting data as long as the configuration files are unchanged.

Then the directory $RDD_ROOT/var/dump/object_descriptor/ is created. All files constructed by rdd are stored there. This directory is referred to as object_dir below.

Each supported language has its own file extension for scripts:
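
The construction of the descriptor and of the dump directory path can be sketched in Python. The function names are hypothetical, and the sketch assumes that the words are joined with underscores exactly as in the pattern shown above.

```python
import os


def object_descriptor(atom_id, prf_entry):
    """Join the object name and the rdd_prf_entry words with underscores."""
    words = [word.strip() for word in prf_entry.split(",") if word.strip()]
    return "_".join([atom_id] + words)


def object_dir(rdd_root, atom_id, prf_entry):
    """Build the path $RDD_ROOT/var/dump/object_descriptor."""
    return os.path.join(rdd_root, "var", "dump",
                        object_descriptor(atom_id, prf_entry))
```

For an object host1 and rdd_prf_entry=base,web the descriptor is host1_base_web.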

.sh
for shell
.py
for python

The file object_dir/dump.ext is the language dump constructed by rdd. It contains language variables with rdd data and inclusions of the library files (or the library code itself for inline entries). For now, library files should not contain any execution commands.

Function calls

The third part of rdd's work is task launch. A task launch is a simple function call inside the constructed script of the chosen language.

The script object_dir/phaseseekdump is constructed for each language to determine whether the required functions belong to it. The script is always written in shell.

The script object_dir/calldump contains the final launch of all functions in all languages. The script itself is always written in shell.

Options

rdd_prf_entry=word1[,word2,...]
Option that starts data reading. It contains the name of a data section, or several section names, found in configuration files. These data sections are also called data words. A data word is the essential unit of data gathering done by rdd.

The content of the configuration files under the given sections is read, and the variables found there are added to the data collection. The format and possible names of configuration files are described in the section Data gathering.

rdd_prf_id=word1[,word2,...]
Option that adds a section or sections to those already read. The value format is the same as in rdd_prf_entry.

When new words are given, they are added to the overall list of required sections, and data reading continues with the content of the newly added ones. The addition is done with rdd_prf_id on the command line, or in a configuration file when this option is met within an already loaded section. Thus any section can contain data of its own and include data from other sections.

When the same option is assigned in several data sections, the value is taken from the first section encountered. Sections are read from right to left in the list, so the rightmost one has the highest priority.

This makes it possible to combine data sets automatically, depending on the order of the requested sections.
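
The reading order can be illustrated with a small Python model. It is a sketch under loud assumptions, not the rdd algorithm: gather is a hypothetical name, sections are modelled as dictionaries, each section is read once, and sections included through rdd_prf_id are assumed to be queued after the sections that were already requested.

```python
from collections import deque


def gather(sections, entry_words):
    """Read sections right to left; the first value met for an option wins.

    Assumption: words from an rdd_prf_id found inside a section are
    appended to the end of the queue, rightmost first.
    """
    pending = deque(reversed(entry_words))
    read_order = []
    data = {}
    while pending:
        word = pending.popleft()
        if word in read_order:
            continue  # every section is read only once
        read_order.append(word)
        for name, value in sections.get(word, {}).items():
            if name == "rdd_prf_id":
                included = [w.strip() for w in value.split(",") if w.strip()]
                pending.extend(reversed(included))
            else:
                data.setdefault(name, value)
    return data, read_order
```

With the entry list base,web the section web is read first, so its values take priority over the values of the same options in base.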

rdd_list_entry=object1[,object2,...]
Option with the name of an object, or several object names, used for task launch.

An object name can be a string or a file inside the rdd_list_path directory. If a file with the given name exists, its content is read as a list of object names, one name per line. Empty lines and comments (starting with the # symbol) are ignored. Lists can be nested, so a list can contain the names of other lists.
The overall list is the combined content of all given lists and separate names, in order from left to right.
String object names may contain the same symbols as file paths, excluding the space symbol and the slash.
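
The expansion of nested lists can be sketched in Python as a recursive walk. The function name expand_objects is hypothetical, and the handling of a list that includes itself (an error here) is an assumption, since this manual does not describe that case.

```python
import os


def expand_objects(entries, list_dir, _active=None):
    """Expand names and list files into a flat list, left to right."""
    active = set() if _active is None else _active
    names = []
    for entry in entries:
        path = os.path.join(list_dir, entry)
        if not os.path.isfile(path):
            names.append(entry)  # a plain object name
            continue
        if entry in active:
            # Assumption: a self-including list is treated as an error.
            raise ValueError("list %r includes itself" % entry)
        active.add(entry)
        with open(path) as handle:
            lines = [line.strip() for line in handle]
        # Skip empty lines and comments starting with "#".
        nested = [line for line in lines if line and not line.startswith("#")]
        names.extend(expand_objects(nested, list_dir, active))
        active.discard(entry)
    return names
```

A file all that contains the line web is expanded further when a file web exists in the same directory; names without a matching file are kept as they are.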

rdd_list_path=dir
Option with the path to the directory containing object lists. The path can be absolute or relative; a relative path is prepended with the root of the working directory.

rdd_map_entry=func1[,func2,...]
Option with the name of a function, or several function names, used in the general launch for each object.

A single entry func, for the whole rdd_map_* option series, has the following format:
[lang ]funcname
[lang::]funcname


The language a function belongs to is determined automatically by rdd. An explicit lang word overrides the language assigned to the function.
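
Both entry forms can be parsed the same way once :: is treated as a space. The Python sketch below is an illustration; parse_func_entry is a hypothetical name, and the language words shell and python used in the example are assumptions.

```python
def parse_func_entry(entry):
    """Split "[lang ]funcname" or "[lang::]funcname" into (lang, funcname).

    lang is None when the entry carries no explicit language word.
    """
    tokens = entry.replace("::", " ").split()
    if len(tokens) == 1:
        return None, tokens[0]
    if len(tokens) == 2:
        return tokens[0], tokens[1]
    raise ValueError("malformed function entry: %r" % entry)
```

Both parse_func_entry("shell deploy") and parse_func_entry("shell::deploy") give the pair ("shell", "deploy").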

rdd_map_libs=[inline ]lang filelib1[, [inline ]lang filelib2, ...]
Option containing a list of files that implement the launched functions. Library files are included in the script built by rdd, in order from left to right, before the function calls.

Each file must be marked with the name lang of the language it is written in. Unlike in all other options of the map series, the language specifier is mandatory.

The supported languages are:
POSIX shell
Python2


When a file is marked with the inline modifier, its entire content is inserted into the script instead of an inclusion construct in the language of the file. This is useful for code that should be visible at first sight inside the resulting script, for readability and clarity.
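
The value format of rdd_map_libs can be parsed as sketched below in Python. The function names are hypothetical, the language words shell and python in the example are assumptions, and the sketch assumes that :: may separate both the inline modifier and the language from the file name.

```python
def parse_lib_entry(entry):
    """Split "[inline ]lang filelib" into (inline, lang, path)."""
    text = entry.replace("::", " ").strip()
    inline = False
    first, _, rest = text.partition(" ")
    if first == "inline":
        inline = True
        text = rest.strip()
    # The language is mandatory; the rest of the entry is the file path,
    # which may itself contain spaces.
    lang, _, path = text.partition(" ")
    path = path.strip()
    if not lang or not path:
        raise ValueError("language and file are both required: %r" % entry)
    return inline, lang, path


def parse_libs(value):
    """Parse a comma-separated rdd_map_libs value, keeping the order."""
    return [parse_lib_entry(e) for e in value.split(",") if e.strip()]
```

The order of the returned list matters, since library files are included from left to right.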

rdd_map_lang=lang1[,lang2,...]
Option with an explicit list of the programming languages in which the launched functions are written.

If this option is not set, the languages in use are derived from rdd_map_libs.

rdd_map_group=func1[,func2,...]
Option with the name of a function, or several function names, used once per launch session.

Functions from the group are launched once per rdd invocation, regardless of the number of objects. For example, these functions can print summary information for the whole session, such as the date or the environment.

rdd_map_autopre=func1[,func2,...]
Option with the name of a function, or several function names, used in the general launch for each object before rdd_map_entry. When rdd_map_entry is changed, functions from this option are still launched.

The option can be used to keep generic methods that are always present, whatever the command line arguments or data set. For example, such methods can do logging or user interaction, while the content of rdd_map_entry varies between data sets or with user choice.

rdd_map_autopost=func1[,func2,...]
Option with the name of a function, or several function names, used in the general launch for each object after rdd_map_entry. Everything described for rdd_map_autopre applies here as well.
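
The per-object call order that follows from rdd_map_autopre, rdd_map_entry and rdd_map_autopost can be sketched in Python. The function name is hypothetical. Functions from rdd_map_group are left out, because this manual does not state where in the session they run.

```python
def per_object_calls(objects, autopre, entry, autopost):
    """List (object, function) pairs in launch order.

    For every object: autopre functions, then entry functions,
    then autopost functions.
    """
    order = list(autopre) + list(entry) + list(autopost)
    return [(obj, func) for obj in objects for func in order]
```

Changing the entry list changes only the middle part of each object's sequence; the autopre and autopost functions stay in place.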

rdd_atom_path=atomdir
Option with the directory containing personal object data.

For each object the file atomdir/objectname/atom.conf is checked. If it is present, its data is loaded into the launch.

rdd_lang_shell=binsh
Option with the interpreter for the shell language. When given as a non-absolute path, the file binsh is searched for according to PATH.

The default value is sh.

rdd_lang_python=binpython
Option with the interpreter for the python language. When given as a non-absolute path, the file binpython is searched for according to PATH.

The default value is python.

rdd_env_filter=string1[,string2,...]
List of strings used as templates to select the names of variables that will be exported from config files to the dump file in the programming language.
The template is constructed as the logical OR of the given strings:

string1* OR string2* ... OR stringN*

The asterisk at the end of each string acts as a wildcard that matches any substring.
When the option value is empty, all data is exported to the language dump without filtering.
The default value is empty.
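
The filter can be read as a prefix match on variable names. The Python sketch below shows that reading; filter_variables is a hypothetical name, and it assumes that each string must match from the start of the variable name.

```python
def filter_variables(data, prefixes):
    """Keep variables whose names start with any of the given strings.

    An empty prefix list means no filtering: everything is exported.
    """
    if not prefixes:
        return dict(data)
    wanted = tuple(prefixes)
    return {name: value for name, value in data.items()
            if name.startswith(wanted)}
```

With the value rdd_,app_ only variables such as rdd_atom_id or app_port would reach the dump.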

rdd_log_num_terminal=integernumber
The number of the file descriptor for a log file, if either rdd_tune_stdout_redirect or rdd_tune_stderr_redirect is turned on.

The value must be an integer between 3 and 9. If it is out of range, the value is set to 3. The default value is 3.

This option is always available in the language function as a variable, whether or not it is set explicitly in config files.
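
The range rule can be written down as a short Python sketch. The function name is hypothetical, and two points are assumptions: the bounds 3 and 9 are treated as inclusive, and a value that is not an integer falls back to 3 as well.

```python
def log_descriptor(value=None):
    """Return a descriptor number in 3..9, falling back to 3."""
    try:
        number = int(value)
    except (TypeError, ValueError):
        return 3  # assumption: unset or non-integer values use the default
    return number if 3 <= number <= 9 else 3
```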

rdd_tune_atom_personal=1|0
Whether an object must have a personal directory inside rdd_atom_path to be considered valid. Any value other than '1' is treated as '0'. The default value is 0.

When set to 1 and an object does not have a personal directory, the launch for that object is skipped.

rdd_tune_stdout_redirect=1|0
Whether the standard output (stdout) of the launched code should be redirected to a file. Any value other than '1' is treated as '0'. The default value is 0.

When set to 1, stdout of the script is redirected to the file named in rdd_log_stdout.

Output to the terminal, visible to the user, is still available through rdd_log_num_terminal.

rdd_tune_stderr_redirect=1|0
Whether the standard error output (stderr) of the launched code should be redirected to a file. Any value other than '1' is treated as '0'. The default value is 0.

When set to 1, stderr of the script is redirected to the file named in rdd_log_stdout.

rdd_tune_phase_buffer_off=1|0
Whether buffering of function language detection should be turned off. Any value other than '1' is treated as '0'. The default value is 0 (buffering is turned on).

This option affects the behaviour of the inner algorithm that checks the existence of methods given on the command line. When the option is set to 1, each method check requires a separate execution of the outer script. With the default value, the outer script is launched once for the largest possible group of methods.

--version
Print the version and exit with code 0.

--help
Print a short help description and exit with code 0.

Generated options

rdd generates the options listed below during its work. These variables are absent from configs, but are stored in the data dump. Assigning these variables in a config file has no effect.

rdd_prf_all="word1 word2 ..."
All data words (sections from configs) loaded in the current set. The rightmost word was read first, word1 last.

If a data variable is stored in two sections that are both present in rdd_prf_all, the resulting value for that variable is taken from the rightmost section.

rdd_atom_id=objectname
The name of the current object. When rdd_list_entry is not set, the value is repo.

rdd_atom_dumpdir=$RDD_ROOT/var/dump/object_descriptor/
Directory to store the atom's personal data. See section Data dump for information on the object_descriptor format.

rdd_log_stdout=$RDD_ROOT/var/dump/object_descriptor/log
File for output redirection, when redirection is turned on. See section Data dump for information on the object_descriptor format.

rdd_exit_status=EXITCODE
Option containing the exit code returned by the terminated subshell that launched the function.

Limits

Maximum length of generic option value is 1023.

Maximum length of object name is 127.

Maximum number of parent directories checked while searching for a relative config location is 200.

Maximum allowed number of files inside all configuration directories is 4096.

Maximum length of single string in the script code, marked as inlined, is 500.

Maximum number of command line arguments is 1000.

Maximum length of a single command line argument is 500.

Maximum quantity of data sections, used at single load, is 65536.

Maximum quantity of list files, used at single load, is 4096.

Environment

RDD_ROOT contains the root of the working copy. The variable is set up by rdd and is available to target code. It is calculated as follows, in priority order:

- The directory in the path to the current directory where rdd.def is stored.
- The directory in the path to the current directory where rdd.conf.d/ is stored.
- The current directory.
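
The search for RDD_ROOT can be sketched in Python as a walk over parent directories. This is an illustration with a hypothetical function name. It assumes that the nearest directory holding a marker is chosen, and it reuses the limit of 200 parent directories from the Limits section.

```python
import os


def find_root(start, max_parents=200):
    """Return the working copy root for the directory start.

    rdd.def is looked for first in start and its parents, then
    rdd.conf.d/; when neither is found, start itself is the root.
    """
    start = os.path.abspath(start)
    markers = (("rdd.def", os.path.isfile), ("rdd.conf.d", os.path.isdir))
    for marker, exists in markers:
        current = start
        for _ in range(max_parents + 1):
            if exists(os.path.join(current, marker)):
                return current
            parent = os.path.dirname(current)
            if parent == current:
                break  # reached the filesystem root
            current = parent
    return start
```

Because rdd.def is searched for first, a directory with rdd.def higher in the tree wins over a nearer directory that holds only rdd.conf.d/.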

RCR_ROOT contains the directory with the executable modules of rcr(1). It is used as the base for relative paths to global config files when local config files for the working directory are absent. See section Data gathering.

RDD_INNER_DEBUG contains the level of inner debug. If set to 1, rdd outputs additional trace information during its work.

See also

rdd-howto(7), datardd(1), dumprdd(1), droprdd(1), lsrdd(1), rootrdd(1), make(1)