Somusar/Tefigel[tm]

Reference Guide

Francesco Aliverti-Piuri

Copyright © 2003-2012 Somusar

Updated on: January 31, 2005

Copyright © 2003-2012 so.mus.ar. s.a.s.
Via Sangallo 30 - 20133 Milan - Italy
All rights reserved.

Unix is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company Limited.

Linux is a registered trademark of Linus Torvalds in the United States and other countries.

Sun, Sun Microsystems, the Sun logo, Solaris, Java, and all Java-based marks are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries.

Symbian and all Symbian-based marks and logos are trademarks of Symbian Software Limited.

Apple and Mac OS are registered trademarks of Apple Computer, Inc. in the United States and other countries.

Intel is a registered trademark of Intel Corporation in the United States and other countries.

PowerPC and CICS are registered trademarks of International Business Machines Corporation in the United States and other countries.

Microsoft, Windows, Visual Basic are either trademarks or registered trademarks of Microsoft Corp. in the United States and/or other countries.

Oracle is a registered trademark, and PL/SQL is a trademark of Oracle Corporation.

SAP and ABAP/4 are registered trademarks of SAP AG in Germany and several other countries.

PostScript is a registered trademark of Adobe Systems Incorporated in the United States and/or other countries.

So.mus.ar, the Somusar logo, Somusar/Software Production Technique, Somusar/Software Production Machine, Somusar/Sisendel, Somusar/Tefigel, Somusar/SoProTech, Somusar/SoProMach, Somusar/Software Entity, Somusar/Software Mold, Somusar/Software Mold Kit, Somusar/Software Mold Building, Somusar/Code Generator Building, Somusar/Generator Building, and Somusar/File Generation Scheme are trademarks of so.mus.ar. s.a.s. in Italy, in the European Union, in the United States of America and other countries.

Other trademarks or service marks referenced herein are property of their respective owners.


Contents
Chapter 1 - Introduction
1.1 - Syntax Notation
Chapter 2 - File Sections
2.1 - Purpose
2.2 - Usage
2.3 - Description
Chapter 3 - Commands and Instructions
3.1 - Purpose
3.2 - Usage
3.3 - Description
Chapter 4 - Markers
4.1 - Purpose
4.2 - Usage
4.3 - Description
Chapter 5 - Comments
5.1 - Purpose
5.2 - Usage
5.3 - Description
Chapter 6 - Special Characters
6.1 - Purpose
6.2 - Usage
6.3 - Description
Chapter 7 - Variables
7.1 - Purpose
7.2 - Usage
7.3 - Description
Chapter 8 - Arithmetic Computation
8.1 - Purpose
8.2 - Usage
8.3 - Description
Chapter 9 - Boolean Computation
9.1 - Purpose
9.2 - Usage
9.3 - Description
Chapter 10 - Control Flow
10.1 - Purpose
10.2 - Usage
10.3 - Description
Chapter 11 - Subroutines and Functions
11.1 - Purpose
11.2 - Usage
11.3 - Description
Chapter 12 - Built-in Functions
12.1 - Purpose
12.2 - Usage
12.3 - Description
12.3.1 - File Handling
12.3.2 - Tag File Processing
12.3.3 - CSV File Processing
12.3.4 - Output Formatting
12.3.5 - Variable Formatting
12.3.6 - String Handling
12.3.7 - Data Group Handling
12.3.8 - List Processing
12.3.9 - Word List Processing
12.3.10 - Record Processing
12.3.11 - Date and Time
12.3.12 - Environment
12.3.13 - Miscellaneous
12.3.14 - Additional Details on make_id
Chapter 13 - Input and Output
13.1 - Purpose
13.2 - Usage
13.3 - Description
Chapter 14 - Packages and Libraries
14.1 - Purpose
14.2 - Usage
14.3 - Description
Chapter 15 - Tag File Processing
15.1 - Purpose
15.2 - Usage
15.3 - Description
Chapter 16 - CSV File Processing
16.1 - Purpose
16.2 - Usage
16.3 - Description
Chapter 17 - Filters
17.1 - Purpose
17.2 - Usage
17.3 - Description
Chapter 18 - Links and Traps
18.1 - Purpose
18.2 - Usage
18.3 - Description
Chapter 19 - Name Spaces
19.1 - Purpose
19.2 - Usage
19.3 - Description
Chapter 20 - Miscellaneous
20.1 - Purpose
20.2 - Usage
20.3 - Description
Chapter 21 - Further Reading

Chapter 1 - Introduction

This document defines and describes the syntax and semantics of the Somusar/Tefigel[tm] language. Each chapter defines purpose and usage of a set of Somusar/Tefigel[tm] constructs, and describes them concisely.

1.1 - Syntax Notation

In the syntax convention used in this document, literal words and characters are indicated by text in bold fixed-font type, as in process. Non-terminal symbols are indicated by text between < and > in italic type, as in <this-is-a-symbol> (when defined) or as in <this-is-a-symbol> (when referred to).

Chapter 2 - File Sections

2.1 - Purpose

An input file for Tefigel may contain one or more <file-section>'s of three types: text, Tefigel code, and comment blocks. File sections improve the readability of Tefigel files and can be freely defined within each Tefigel file, in no particular order. By default the content of a Tefigel file whose name ends in ".tfg" is assumed to start with a <tefigel-section>. The content of a Tefigel file whose name does not end in ".tfg" is assumed to start with a <text-section>.

2.2 - Usage

Define a <text-section>:

   text
   <text-section>
   endtext
Define a <tefigel-section>:
   tefigel
   <tefigel-section>
   endtefigel
Define a <comment-section>:
   <comment-marker>\
   <comment-section>
   <comment-marker>/

2.3 - Description

A <file-section> consists of zero or more <logical-line>'s.

The content of a <text-section> is subject to textual substitution of <variable>'s and in-line function calls via <subprogram-reference>'s. Lines starting with the current <command-marker> are treated as <command>'s. Lines starting with the current <comment-marker> are treated as <comment-line>'s.

The content of a <tefigel-section> is processed as a sequence of <command>'s or subroutine and function calls via <subprogram-reference>'s. Lines starting with any current <instruction-marker>, possibly preceded by blanks and/or tabs, are processed as the corresponding <instruction>. In particular, lines starting with the current <comment-marker>, possibly preceded by blanks and/or tabs, are treated as <comment-line>'s.

The content of a <comment-section>, including its opening and closing delimiters, is treated as a sequence of <comment-line>'s.

Both <text-section>'s and <tefigel-section>'s may contain any number of <comment-section>'s.

Chapter 3 - Commands and Instructions

3.1 - Purpose

An input file for Tefigel consists of a sequence of <logical-line>'s. Commands and instructions steer the file generation process performed by Tefigel on its input(s) to generate its output(s).

3.2 - Usage

   <command-marker> <instruction> <instruction-parameters>
   <command-marker> <subprogram-reference>
   <command-marker> <subprogram-reference>(<parameter-list>)
   <instruction-marker> <instruction-parameters>

3.3 - Description

A <logical-line> consists of one or more physical lines of text; if the last character of a physical line matches the current linebreak <special-function> then the logical line continues on the next physical line.
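
The joining rule can be modelled in a few lines of Python. This is a sketch of the described behaviour, not Tefigel's implementation: the function name is hypothetical, the backslash default merely stands in for the current linebreak special character (this guide does not state its initial value), and dropping the linebreak character when joining is an assumption.

```python
def logical_lines(physical_lines, linebreak="\\"):
    """Join physical lines into logical lines.

    Assumption: `linebreak` stands in for the current linebreak special
    character; the backslash default is illustrative only.
    """
    result = []
    pending = ""
    for line in physical_lines:
        if line.endswith(linebreak):
            # Continuation: drop the marker (an assumption) and keep reading.
            pending += line[: -len(linebreak)]
        else:
            result.append(pending + line)
            pending = ""
    if pending:
        # Input ended on a continuation line; emit what was collected.
        result.append(pending)
    return result
```

For instance, two physical lines of which the first ends in the linebreak character yield a single logical line.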

A Tefigel <command> is a logical text line beginning with one of the following: the current <command-marker>, or the current <instruction-marker> of a specific instruction.

A Tefigel <instruction> is a Tefigel keyword in lower case (default) or upper case, depending on the current value of <control-variable> CMD_CASE.

The <instruction-marker> and <command-marker>, collectively referred to as <marker>'s, are characters associated respectively with one specific instruction or with all instructions, and appearing as the first character in an input line. White spaces (blank or tab) appearing on a command line between the <command-marker> and the instruction are ignored.

Within <tefigel-section>'s the <command-marker> may be omitted and leading white spaces (blank or tab) are ignored.

Depending on the <instruction>, a list of <instruction-parameters> may follow the Tefigel keyword.

Chapter 4 - Markers

4.1 - Purpose

Define a first-column marker for Tefigel commands or instructions.

4.2 - Usage

Define a new command or instruction marker:

   mark CMD <new-marker>
   mark <instruction> <new-marker>

Restore the previous command or instruction marker (if any):

   mark CMD
   mark <instruction>

4.3 - Description

The initial value for the CMD (command) marker is the at-sign @.

The initial value for the call (subroutine call) marker is the dollar-sign $.

The initial value for the rem (remark) marker is the sharp-sign #.

The <instruction>'s that can be specified as parameters for mark are the following:

   rem
   call
   set
   globset
   unset
   link
   unlink
   trap
   untrap
   add
   sub
   mul
   div
   trunc
   neg
   eval
   and
   or
   not
A <marker> is the most recent <new-marker> defined for the corresponding instruction. The <new-marker> can be any non-blank character. If no <new-marker> is specified, the previous corresponding marker (if any) is automatically restored.
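
The define-and-restore behaviour of mark resembles a stack kept per instruction. The Python model below is a sketch under assumptions: the class and method names are hypothetical, the history is assumed to be of unlimited depth, and a restore request with nothing earlier to restore is assumed to leave the current marker unchanged. Only the three initial markers stated in this chapter are preloaded.

```python
class Markers:
    """Per-instruction stacks of marker characters (illustrative model)."""

    def __init__(self):
        # Initial markers stated in this chapter.
        self._stacks = {"CMD": ["@"], "call": ["$"], "rem": ["#"]}

    def mark(self, instruction, new_marker=None):
        stack = self._stacks.setdefault(instruction, [])
        if new_marker is None:
            # "mark <instruction>": restore the previous marker, if any.
            # Assumption: with nothing to restore, the marker is unchanged.
            if len(stack) > 1:
                stack.pop()
        else:
            if len(new_marker) != 1 or new_marker.isspace():
                raise ValueError("a marker is a single non-blank character")
            stack.append(new_marker)

    def current(self, instruction):
        stack = self._stacks.get(instruction)
        return stack[-1] if stack else None
```

Defining a new CMD marker and then issuing mark CMD without a parameter returns, in this model, to the at-sign.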

Chapter 5 - Comments

5.1 - Purpose

Add textual descriptions to Tefigel scripts.

5.2 - Usage

   rem comment text
   <comment-marker> comment text

5.3 - Description

The <comment-marker> is the most recent <new-marker> specified for instruction rem. Its initial value is the sharp-sign #.

All <comment-line>'s in input are ignored by Tefigel.

Chapter 6 - Special Characters

6.1 - Purpose

Associate a special text-processing function with a single character.

6.2 - Usage

Define a new special character for a given special function:

   <special-function> <new-special-character>

Restore the previous special character (if any) for a given special function:

   <special-function>

6.3 - Description

The <special-function> can be one of the following:

The <new-special-character> can be any non-blank character. If no <new-special-character> is specified, the previous corresponding special character (if any) is automatically restored.

Chapter 7 - Variables

7.1 - Purpose

Define placeholders for character strings that can also be used as traditional program variables.

7.2 - Usage

Set a variable in the global name space:

   globset <variable>=<value>
   globset <associative-variable>=<value>
Set a new variable in the current name space, or change the value of a previously set variable:
   set <variable>=<value>
   set <associative-variable>=<value>
Unset a variable:
   unset <variable>
   unset <associative-variable>
Access contents of a variable within an input line (both text and command line):
   text text text text text text text text
   text text text <variable> text text text
   text text text text text text text text

7.3 - Description

A <variable> is defined as an identifier represented by a sequence of letters, digits and underscore ('_'). The first character of the identifier must be either a letter or an underscore. Tefigel is case-sensitive with respect to variable identifiers.

A <value> can be any string and requires no delimiters: every character after the assignment operator = up to the end of the logical command line is considered part of the <value> to be stored in the <variable>, including blanks and tabs, which are treated exactly as all other characters. A <value> may contain other variables as well as in-line function calls, so that <variable> is assigned the string of characters resulting from replacing the other variables with their contents, and the in-line function calls with their return values.

An <associative-variable> is defined as the concatenation of two or more identifiers by means of the current dash special character. The identifiers may refer to previously set variables - in which case their current value is used to construct the corresponding part of the identifier of the <associative-variable> - or not refer to any variable - in which case they will literally be used to construct the corresponding part of the identifier of the <associative-variable>.
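
The construction of an associative identifier can be sketched in Python as follows. The function name is hypothetical, and two points are assumptions not confirmed by this guide: that the dash special character is "-", and that the dash is kept between the parts of the resulting identifier.

```python
def associative_identifier(expression, variables, dash="-"):
    """Build the identifier of an associative variable.

    Each part that names a set variable is replaced by that variable's
    current value; any other part is used literally.
    """
    parts = expression.split(dash)
    # Assumption: the parts are re-joined with the same dash character.
    return dash.join(variables.get(part, part) for part in parts)
```

With a variable item currently set to apple, the expression color-item would address the identifier color-apple in this model.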

Depending on its current value, a variable may be used for arithmetic or boolean computation, as explained later in this guide.

Chapter 8 - Arithmetic Computation

8.1 - Purpose

Perform basic arithmetic operations.

8.2 - Usage

Add, subtract, multiply, divide two numbers:

   <arithmetic-instruction> <numerical-variable> <numerical-value>
Compute the negative value of a numerical variable:
   neg <numerical-variable>
Truncate the value of a numerical variable toward zero, that is, to the nearest integer value between that value and 0:
   trunc <numerical-variable>

8.3 - Description

The <arithmetic-instruction> must be one of the following:

   add
   sub
   mul
   div
A <numerical-variable> must be the identifier of a variable whose current value is a string representing a decimal integer or floating-point number.

A <numerical-value> can be either a <numerical-variable>, or a string representing a decimal integer or floating-point number. The value resulting from the arithmetic computation is stored into <numerical-variable>.

Tefigel always performs arithmetic computation in floating point, using a decimal (base-10) representation.
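
The following Python sketch models the arithmetic instructions on a dictionary of string-valued variables. It is illustrative only: the function names are hypothetical, Python's decimal module is used merely as a stand-in for decimal floating-point arithmetic, and the textual format of the results is an assumption that may differ from Tefigel's output.

```python
from decimal import Decimal


def arithmetic(instruction, variables, name, operand):
    """Apply add/sub/mul/div to variables[name], storing the result as a string."""
    # The operand is either another variable's identifier or a numeric literal.
    right = Decimal(variables.get(operand, operand))
    left = Decimal(variables[name])
    operations = {
        "add": lambda a, b: a + b,
        "sub": lambda a, b: a - b,
        "mul": lambda a, b: a * b,
        "div": lambda a, b: a / b,
    }
    variables[name] = str(operations[instruction](left, right))


def trunc(variables, name):
    """Truncate toward zero: the nearest integer between the value and 0."""
    variables[name] = str(int(Decimal(variables[name])))


def neg(variables, name):
    """Replace the value with its negative."""
    variables[name] = str(-Decimal(variables[name]))
```

Note that truncation moves toward zero in both directions, so a value of -2.7 becomes -2 rather than -3.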

Chapter 9 - Boolean Computation

9.1 - Purpose

Perform simple boolean (logical) computation.

9.2 - Usage

Evaluate a plain boolean comparison storing the result into a variable:

   eval <target-variable> <boolean-comparison>
Compute boolean and or or operation:
   <boolean-instruction> <target-variable> <boolean-comparison>
Compute boolean not operation:
   not <boolean-variable>

9.3 - Description

The <boolean-instruction> must be one of the following:

   and
   or

A <target-variable> is defined as the identifier of a <variable> whose value will be set to 1 or 0 depending on the boolean result - true or false, respectively - yielded by the boolean comparison or operation.

A <boolean-variable> is defined as the identifier of a <variable> whose current value is a string representing a number, the boolean value of which will be considered false if the number equals 0, true if the number equals 1, or invalid otherwise.

A <boolean-comparison> is defined as follows:

   <comparison-variable><comparison-operator><comparison-value>
   <comparison-variable>
A <comparison-variable> is the identifier of either a <variable> or an <associative-variable>, possibly not set yet.

A <comparison-value> can be either a <variable>, or a string of characters, possibly empty.

The <comparison-operator> must be one of the following characters:

With the exception of comparison operator ~, that always implies a pattern-matching comparison between one string value and a regular expression, the type of comparison performed by Tefigel depends on the value of <comparison-variable> and <comparison-value>: if both values are numerical, then a numerical comparison takes place, otherwise a lexicographical comparison is performed.
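
The choice between numerical and lexicographical comparison can be modelled as below. This is a Python sketch, not Tefigel's implementation: the function names are hypothetical, and the pattern used to recognise a number is an assumption, since this guide does not specify the exact numeric syntax accepted. Pattern matching with the ~ operator is not covered.

```python
import re

# Assumed shape of a decimal integer or floating-point number.
_NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)")


def is_number(text):
    return _NUMBER.fullmatch(text) is not None


def compare(left, right):
    """Return -1, 0 or 1, choosing numerical or lexicographical comparison."""
    if is_number(left) and is_number(right):
        a, b = float(left), float(right)  # both numerical: compare as numbers
    else:
        a, b = left, right  # otherwise: compare as character strings
    return (a > b) - (a < b)
```

The distinction matters for values such as 10 and 9: compared as numbers 10 is greater, whereas the strings 10 and 9a compare lexicographically, so 10 sorts first.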

A <boolean-comparison> with no <comparison-operator> and no <comparison-value> is assumed to be as follows:

   <comparison-variable>#0

Chapter 10 - Control Flow

10.1 - Purpose

Specify the order in which the processing of input lines must be performed by Tefigel.

10.2 - Usage

Terminate processing:

   exit <exit-code>
Perform a conditional statement:
   if <boolean-comparison>
   <input-block>
   else
   <alternative-input-block>
   endif
Perform a multiblock conditional statement:
   case <case-variable>
   when <case-evaluator>
   <input-block>
   when <case-evaluator>
   <input-block>
   otherwise
   <alternative-input-block>
   endcase
Perform a while loop:
   while <boolean-comparison>
   <input-block>
   endwhile
Perform a for each loop, with automatic <counter-variable>'s and automatic <total-loops-variable>'s:
   for <loop-variable>=<for-each-list>
   <input-block>
   endfor
Perform a numeric loop:
   loop <loop-variable>=<range-definition>
   <input-block>
   endloop
Perform an unconditional transfer of control:
   jump <target-label>
   <input-block>
   label <target-label>
Perform a conditional transfer of control:
   jumpcond <target-label> <boolean-comparison>
   <input-block>
   label <target-label>

10.3 - Description

An <exit-code> is an integer number to be returned to the operating system process that started Tefigel.

An <input-block>, or an <alternative-input-block>, is a sequence of logical input lines, which may contain zero or more Tefigel command lines.

A <case-evaluator> is defined as follows:

   <comparison-operator><comparison-value>
A <case-variable> is a <variable> that gets automatically initialized and evaluated by Tefigel while performing the corresponding multiblock conditional statement.

A <for-each-list> is a list of strings separated by the current argdelim <special-function>.

A <counter-variable> is a numeric variable that gets automatically initialized to 0 or 1 and incremented at each iteration of the for loop. There are four such variables, each named after the corresponding <loop-variable> with one of the following suffixes:

A <total-loops-variable> is a numeric variable that gets automatically initialized with the for loop. There are two such variables, named after the corresponding <loop-variable> with suffixes _loops and _LOOPS respectively.

All <counter-variable>'s and <total-loops-variable>'s get automatically unset after the corresponding endfor instruction.

A <range-definition> is a list of two or three numbers separated by the current argdelim <special-function>. The first number specifies the initialization value for the <loop-variable>; the second number specifies the loop termination value; the third optional number specifies the loop increment or decrement step, which defaults to 1 if not otherwise specified.
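
Parsing a range definition can be sketched in Python as follows. The function name is hypothetical and the comma default for the argdelim character is an assumption. The sketch only extracts the three values and deliberately does not model the loop termination test.

```python
def parse_range(definition, argdelim=","):
    """Split a range definition into (start, end, step).

    Assumption: `argdelim` stands in for the current argdelim special
    character; the comma default is illustrative only.
    """
    parts = [float(part) for part in definition.split(argdelim)]
    if len(parts) not in (2, 3):
        raise ValueError("a range definition has two or three numbers")
    start, end = parts[0], parts[1]
    # The optional third number is the step, which defaults to 1.
    step = parts[2] if len(parts) == 3 else 1.0
    return start, end, step
```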

A <loop-variable> is a <variable> that gets automatically initialized and updated by Tefigel while performing the corresponding loop.

A <target-label> is a sequence of non-blank characters, namely letters, digits, and underscores ('_').

When performing a conditional statement, the <input-block> is processed if the value of <boolean-comparison> is true, whereas the <alternative-input-block> is processed if the value of <boolean-comparison> is false. The else instruction and the <alternative-input-block> may be omitted.

When performing a multiblock conditional statement, each <input-block> is processed if the value of the related <case-evaluator> applied to the <case-variable> yields true, whereas the <alternative-input-block>, if present, is processed only if all <case-evaluator>'s evaluate to false.

When performing a loop, depending on the type of loop, the <input-block> is repeatedly processed as long as:

When performing an unconditional transfer of control, on processing the jump instruction file generation control is transferred to the input line immediately following the label instruction, so that the <input-block> is not processed. Both forward and backward transfer of control are allowed.

When performing a conditional transfer of control, on processing the jumpcond instruction, if the <boolean-comparison> yields true, file generation control is transferred to the input line immediately following the label instruction, so that the <input-block> is not processed. Both forward and backward conditional transfer of control are allowed.

Chapter 11 - Subroutines and Functions

11.1 - Purpose

Divide a complex file generation process into a set of simpler functional modules that isolate repetitive or specific tasks and are easier to write, test, and manage.

11.2 - Usage

Transfer control to a subroutine (a file) in the same variable name space:

   process <file-reference>
Transfer control to a subroutine (a file) in the same variable name space, ignoring non-existent subroutine files:
   process_if_readable <file-reference>
Transfer control to a subroutine (a file) in a new variable name space:
   [call] <subprogram-reference>
   [call] <subprogram-reference>(<parameter-list>)
Transfer control to a function (a file) in a new variable name space, using its return value within a logical line:
   <call-key><subprogram-reference>(<parameter-list>)
Assign values of a <parameter-list> to a corresponding set of variables in the new variable name space of current function:
   interface(<new-variable-list>)
Shift the <parameter-list>:
   shift
Set return value of current function:
   retvalue=<value>
Return from a function or subroutine:
   quit
Declare a file to be a <code-file>:
   codefile
Note:

Instruction codefile is obsolete and superseded by <file-section>'s.

11.3 - Description

A <file-reference> can be either a valid file pathname - absolute or relative - or a <variable> containing a string corresponding to a valid pathname. Relative file pathnames of functions or subroutines are first searched for relative to the working directory of Tefigel, then using the current <library-path-specifier>, described later in this guide. Tefigel will first attempt to access the file with the given name; in case of failure it will try to access the file adding the extension ".tfg". In other words, extension ".tfg" is optional and may be omitted in <file-reference>'s.

A <subprogram-reference> can be either a <file-reference> or a <built-in-function>. The call instruction may be omitted.

A <parameter-list> is a sequence of strings or <variable>'s, separated by the character currently associated with <special-function> argdelim.

The <call-key> is the character currently associated with <special-function> callkey.

A <new-variable-list> is a sequence of <variable>'s, separated by the character currently associated with <special-function> argdelim.

The <parameter-list> may be omitted, in which case it is also possible to omit the left and right parentheses. Arguments supplied as parameters to a function or subroutine are made available in the new name space of the called function or subroutine as "register" variables identified as REG_0, REG_1, REG_2, etc., according to their sequential position in the <parameter-list>. Unspecified registers are guaranteed to be initialized to the empty string. The number of arguments is available in variable REG_COUNT. The complete list of arguments is also available in variable REG_ALL.

Instruction shift transfers the contents of each REG_n into REG_(n-1), decrements REG_COUNT, and updates REG_ALL accordingly.

Instruction interface automatically initializes each variable in the <new-variable-list> to the value of the register in the corresponding order position. A <new-variable-list> in the form "..." specifies that the function or subroutine has a variable number of parameters, hence no automatic initialization is performed.
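
The register mechanism, together with shift and interface, can be modelled in Python as shown below. This is a sketch under assumptions: the class and method names are hypothetical, REG_ALL is assumed to be the arguments joined by the argdelim character (taken here to be a comma), and shift is assumed to discard the former contents of REG_0.

```python
class Registers:
    """Illustrative model of the REG_n variables of a called subprogram."""

    def __init__(self, arguments, argdelim=","):
        self.args = list(arguments)
        self.argdelim = argdelim  # assumed delimiter, for REG_ALL only

    def variables(self):
        regs = {f"REG_{i}": value for i, value in enumerate(self.args)}
        regs["REG_COUNT"] = str(len(self.args))
        regs["REG_ALL"] = self.argdelim.join(self.args)  # assumed format
        return regs

    def shift(self):
        # REG_1 becomes REG_0, REG_2 becomes REG_1, and so on.
        if self.args:
            self.args.pop(0)

    def interface(self, names):
        # "..." declares a variable number of parameters: nothing is assigned.
        if names == ["..."]:
            return {}
        # Registers without a supplied argument read as the empty string.
        return {
            name: (self.args[i] if i < len(self.args) else "")
            for i, name in enumerate(names)
        }
```

In this model, calling a subprogram with arguments a, b, c and then issuing shift leaves REG_0 equal to b and REG_COUNT equal to 2.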

Beyond positional association, no restriction is imposed, and no check is performed, by Tefigel on the function or subroutine signature, so that it is possible to supply a variable number of arguments to functions and subroutines. Currently, Tefigel supports up to 32 parameters.

Functions may set their return value by means of instruction retvalue.

Returning from a function or subroutine is automatically performed when the end of the file containing the function or subroutine is reached. Alternatively, instruction quit can be used to explicitly return from the function or subroutine.

A <code-file> is a file where each line is interpreted as a <command>.

Note:

This feature is obsolete and superseded by <tefigel-section>'s.

Chapter 12 - Built-in Functions

12.1 - Purpose

Perform frequently needed routine tasks.

12.2 - Usage

Built-in functions are used in the same way as user-defined functions are used:

   [call] <built-in-function>(<parameter-list>)
   <call-key><built-in-function>(<parameter-list>)

12.3 - Description

A <built-in-function> is one of the subroutines and functions listed below.

Built-in functions have a higher precedence in comparison with file functions or subroutines, so that it is not possible to override a built-in function with a file function or subroutine carrying the same name.

By default built-in functions carry the lower-case names listed below, unless <control-variable> CMD_CASE is switched to upper-case mode, in which case their names are also switched to upper-case.

12.3.1 - File Handling

12.3.2 - Tag File Processing

12.3.3 - CSV File Processing

12.3.4 - Output Formatting

12.3.5 - Variable Formatting

12.3.6 - String Handling

12.3.7 - Data Group Handling

12.3.8 - List Processing

12.3.9 - Word List Processing

12.3.10 - Record Processing

12.3.11 - Date and Time

12.3.12 - Environment

12.3.13 - Miscellaneous

12.3.14 - Additional Details on make_id

A <base-identifier> is the identifier of a variable, defined as a string of letters, decimal digits and underscores ('_'), as in this_is_a_Valid_BASE_id. An <id-pattern> is one of the strings listed below, providing a mnemonic identifier pattern specifying the desired format for the conversion of a given <base-identifier>. The following list shows the identifier resulting from applying an <id-pattern> to this_is_a_Valid_BASE_id.

Chapter 13 - Input and Output

13.1 - Purpose

Manage input files, which provide text and commands for Tefigel, and output files, where the result of Tefigel's processing is stored.

13.2 - Usage

Change the source of input text and commands: the same instructions used to divide a complex file generation task into a set of more manageable, smaller tasks - namely process, process_if_readable, and call, described in the previous chapter - can also be used to perform this function, as Tefigel allows text and commands to be freely intermixed in any text file.

Attach, without any processing, the contents of a file to the current output file:

   attach <file-reference>
Write output to a file, creating it if it does not exist, or overwriting it if it already exists:
   output <file-reference>
Write output to a file, creating it if it does not exist, or extending it from its current end if it already exists:
   append <file-reference>
Write a line of text to the current output file:
   echo text
Write a line of text to the current diagnostic file (standard error):
   msg text

13.3 - Description

The distinction between file processing and subroutine or function calling in Tefigel is purely theoretical; in practice, calling a subroutine or a function implies the processing of the file that implements the called subroutine or function.

Chapter 14 - Packages and Libraries

14.1 - Purpose

Divide logically related subroutines and functions into packages, hierarchically grouped in collections called libraries.

14.2 - Usage

Define a logical root, called library path, for a set of function and subroutine packages, to be used as the starting point within the file-system when searching for files, functions, or subroutines, to be processed or called.

   library <library-path-specifier>
Discard current definition of library path:
   library

14.3 - Description

A <library-path-specifier> can be either a valid directory pathname - absolute or relative - or a <variable> containing a string corresponding to a valid directory pathname.

File pathnames supplied as parameters to call and process instructions are first used literally by Tefigel to attempt completion of the relevant instruction; should this attempt fail, then the current library path - if previously defined by means of a library instruction - is used as a prefix to attempt the requested calling or processing of the supplied file pathname.

Tefigel allows only one library path to be defined at a time, to ensure that called or processed files can be uniquely identified.
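
Combining this lookup rule with the optional ".tfg" extension described in Chapter 11, the pathnames tried for a file reference can be listed as in the Python sketch below. The function name is hypothetical, and the exact interleaving of the two rules (extension first, then library prefix) is an assumption; this guide only states that the literal pathname is attempted before the library path.

```python
import posixpath


def candidate_paths(file_reference, library_path=None):
    """List the pathnames to try, in order, for a file reference."""
    # First the name as given, then the name with the optional extension.
    names = [file_reference, file_reference + ".tfg"]
    candidates = list(names)
    # Assumption: the library path is applied to relative pathnames only.
    if library_path is not None and not posixpath.isabs(file_reference):
        candidates += [posixpath.join(library_path, name) for name in names]
    return candidates
```

With a library path of /opt/lib, the reference pkg/util would be tried as pkg/util, pkg/util.tfg, /opt/lib/pkg/util and /opt/lib/pkg/util.tfg in this model.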

Chapter 15 - Tag File Processing

15.1 - Purpose

Associate Tefigel processors (subroutines) with tag file contents and process a tag file (typically, XML or HTML files).

15.2 - Usage

Process a tag file applying corresponding user-defined Tefigel scripts from a given directory:

   [call] tag_file_process(<tag-file>,<subroutine-dir>,<case-sensitive>)

15.3 - Description

The built-in subroutine tag_file_process parses the specified <tag-file> and processes its contents through user-defined Tefigel scripts contained in directory <subroutine-dir>. Both <tag-file> and <subroutine-dir> are specified as <file-reference>'s. The optional <case-sensitive> flag, which defaults to 1, can be set to 0 for case-insensitive tag identifiers, such as HTML tags.

When processing <tag-file>, Tefigel calls subroutines from <subroutine-dir> according to the scheme described below. Only available subroutines are called: it is not required to supply all subroutines described below. Missing subroutines will result in Tefigel silently ignoring them.

  1. On entering and exiting the processing of the tag file, Tefigel calls subroutines tag_tree.in and tag_tree.out respectively;

  2. On entering and exiting the processing of each tag node, Tefigel calls subroutines tag_node.in and tag_node.out respectively;

  3. On entering and exiting the processing of a tag node (element) identified by tagId, Tefigel calls subroutines tagId.in and tagId.out respectively;

  4. Input text within one element tagId is processed through subroutines tag_node.tval and tagId.tval;

  5. XML CDATA sections are processed through subroutine tag_cdata;

  6. Comment tags are processed through subroutine tag_comment.

Calls to subroutines 2-4 above are issued only if the current tag matches the tag path search criteria specified by means of variable TARGET_TAG_PATH, which should be set to a regular expression according to the tag nodes of interest.

The following variables are automatically set by Tefigel when executing a tag_file_process request:

  1. TAG_MACRO_LIB_PATH is set to the path of <subroutine-dir>;

  2. TAG_PATH is set to the current tag nesting path;

  3. TAG_CONTENTS is set to the source contents of the current tag;

  4. TAG_ID is set to the identifier of the current tag;

  5. Value of each tag attribute tagAttr is assigned to a corresponding variable tp_tagAttr;

  6. Input text within one element is assigned to variable TAG_TEXT;

  7. CDATA text within XML CDATA sections is assigned to variable TAG_CDATA_TEXT;

  8. Each comment tag is assigned to variable TAG_COMMENT_TEXT.

Variables 2-6 in the above list change dynamically with each tag node being processed. All punctuation marks in tag identifiers are converted to underscore _.
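
The calling scheme can be illustrated with a Python sketch that walks a small XML document and lists the subroutine names that would be looked up. It is a model under several assumptions, none confirmed by this guide: the function name is hypothetical, the tag path is assumed to be a slash-separated list of identifiers, the generic tag_node subroutines are assumed to be called before the tag-specific ones on entry and after them on exit, and whitespace-only text is assumed to be skipped. CDATA sections and comments are not modelled.

```python
import re
import xml.etree.ElementTree as ET


def subroutine_calls(xml_text, target_tag_path=".*"):
    """Return the subroutine names looked up while walking an XML document."""
    calls = ["tag_tree.in"]

    def visit(element, parent_path):
        # Punctuation in tag identifiers becomes an underscore.
        tag_id = re.sub(r"[^\w]", "_", element.tag)
        path = f"{parent_path}/{tag_id}"  # assumed tag path format
        selected = re.search(target_tag_path, path) is not None
        if selected:
            calls.extend(["tag_node.in", f"{tag_id}.in"])
            if element.text and element.text.strip():
                calls.extend(["tag_node.tval", f"{tag_id}.tval"])
        for child in element:
            visit(child, path)
        if selected:
            calls.extend([f"{tag_id}.out", "tag_node.out"])

    visit(ET.fromstring(xml_text), "")
    calls.append("tag_tree.out")
    return calls
```

Restricting the target tag path to elements named p means that, in this model, the enclosing element triggers no node-level calls while tag_tree.in and tag_tree.out are still issued.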

Chapter 16 - CSV File Processing

16.1 - Purpose

Process a comma-separated value (CSV) file (typically, a file containing row-column data or metadata) through a Tefigel processor (subroutine).

16.2 - Usage

Process a CSV file through a user-defined CSV subroutine (a Tefigel script):

   [call] csv_file_process(<CSV-file>,<CSV-subroutine>,<value-separator>)

16.3 - Description

The built-in subroutine csv_file_process reads the <CSV-file> and processes each line of its contents through the user-defined Tefigel script <CSV-subroutine>. Both <CSV-file> and <CSV-subroutine> are specified as <file-reference>'s. Parameter <value-separator> specifies the character separating each value in each line of <CSV-file>.

When processing <CSV-file>, Tefigel calls for each input line the specified <CSV-subroutine> passing the input line as a <parameter-list>, so that each value from the input line appears as a distinct input argument to the <CSV-subroutine>. Tefigel automatically sets the <special-function> argdelim to <value-separator>.
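
The per-line dispatch can be modelled in Python as below. This is a sketch, not the built-in itself: the Python function takes an iterable of lines and a callable in place of the two file references, and it assumes plain splitting on the separator, with no handling of quoted values.

```python
def csv_file_process(lines, subroutine, value_separator=","):
    """Call `subroutine` once per input line, one argument per value."""
    results = []
    for line in lines:
        # Assumption: values are split on the separator, without quoting rules.
        values = line.rstrip("\n").split(value_separator)
        # Each value becomes a distinct argument (REG_0, REG_1, ...).
        results.append(subroutine(*values))
    return results
```

A line such as a;b;c processed with separator ; therefore reaches the subroutine as three separate arguments.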

Chapter 17 - Filters

17.1 - Purpose

Associate Tefigel processors (subroutines) with text lines matching regular expressions.

17.2 - Usage

Activate a filter for a given regular expression:

   filter <file-reference> <regular-expression>
Deactivate filter previously associated with a regular expression:
   filter <file-reference>

17.3 - Description

When an input line processed by Tefigel matches the <regular-expression> associated with a filter, a call to the subroutine specified by means of <file-reference> is automatically issued by Tefigel, providing as argument the whole text line via REG_0.
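
Filter activation, deactivation and dispatch can be sketched in Python as follows. The class and method names are hypothetical, Python callables stand in for the subroutine files, and Python's regular expression dialect is assumed, which may differ from the one Tefigel uses. What happens to the matching line in the output is not modelled.

```python
import re


class Filters:
    """Illustrative registry of filters keyed by their handler."""

    def __init__(self):
        self._filters = {}  # handler -> compiled regular expression

    def filter(self, handler, regular_expression=None):
        if regular_expression is None:
            # "filter <file-reference>": deactivate the filter.
            self._filters.pop(handler, None)
        else:
            self._filters[handler] = re.compile(regular_expression)

    def dispatch(self, line):
        """Call every handler whose expression matches, passing the line as REG_0."""
        return [
            handler(line)
            for handler, expression in self._filters.items()
            if expression.search(line)
        ]
```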

Chapter 18 - Links and Traps

18.1 - Purpose

Associate Tefigel processors (subroutines) with input character strings resembling identifiers.

18.2 - Usage

Associate a parametrical function call with a character string:

   link <link>=<subprogram-reference>

Associate a non-parametrical function call with a character string:

   trap <trap>=<subprogram-reference>

Disassociate function call from character string:

   unlink <link>
   untrap <trap>

Activate implicit call to a parametrical function from an input line (both text and command line):

   text text text text text text text text
   text text text <link> text text text
   text text text text text text text text
   text text text <link>(par1, par2, ...) text text text
   text text text text text text text text

Activate implicit call to a non-parametrical function from an input line (both text and command line):

   text text text text text text text text
   text text <trap> text text text
   text text text text text text text text
   <trap>
   text text text text text text text text

18.3 - Description

A <link> is a <variable> associated with a <file-reference>. On encountering a <link> in its input, Tefigel calls the associated <file-reference> providing the given arguments, if any, and replaces in output the <link> call with the string returned by the called function, if any.

A <trap> is a <variable> associated with a <file-reference>. On encountering a <trap> in its input, Tefigel calls the associated <file-reference> without arguments, and replaces in output the <trap> with the string returned by the called function, if any.
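The substitution performed for links and traps can be illustrated with a Python model. This is a sketch only, not Tefigel code: the names expand, greet and today are hypothetical. The way identifiers and argument lists are recognized here (a regular expression with comma-separated, non-nested arguments) is an assumption made for the example and is not a statement about the Tefigel parser.

```python
import re

# An identifier, optionally followed by a parenthesized argument list.
CALL = re.compile(r"(?<![A-Za-z0-9_])([A-Za-z_][A-Za-z0-9_]*)(\(([^()]*)\))?")

def expand(line, links, traps):
    """Replace each link or trap found in `line` with its function's result."""
    def replace(match):
        name, parens, raw_args = match.group(1), match.group(2), match.group(3)
        if name in links:
            # A link is called with the given arguments, if any.
            args = [a.strip() for a in raw_args.split(",")] if raw_args else []
            return links[name](*args)
        if name in traps:
            # A trap is called without arguments; any parenthesized
            # text following it is left in place (assumption).
            return traps[name]() + (parens or "")
        return match.group(0)
    return CALL.sub(replace, line)

def greet(*names):
    """Stand-in for a function associated with a link."""
    if names:
        return "Hello " + " and ".join(names)
    return "Hello"

def today():
    """Stand-in for a function associated with a trap."""
    return "Monday"

links = {"GREET": greet}
traps = {"TODAY": today}
expanded = expand("Say GREET(Ann, Bob) on TODAY.", links, traps)
```

In this model the link GREET may appear with or without an argument list, while the trap TODAY is always called without arguments.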

Chapter 19 - Name Spaces

19.1 - Purpose

Create nested scopes of variables, to enable saving and restoring of values stored in variables.

19.2 - Usage

Save contents of all non-global variables:

   push

Restore contents of all non-global variables:

   pop

19.3 - Description

All variables defined using a set instruction are kept in name spaces, which are saved and restored either explicitly, by push and pop instructions, or implicitly, on calls to functions and subroutines.

All variables defined using a globset instruction are kept in a global name space, which is unaffected by the push, pop and call instructions.

Instruction process does not affect name spaces.
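The difference between the two kinds of variables can be pictured with a short Python model. It is a sketch only, not Tefigel code: the class NameSpaces and its attributes are hypothetical. The model assumes that push takes a snapshot of the non-global variables and leaves them visible afterwards, a detail this chapter does not specify.

```python
class NameSpaces:
    """Hypothetical model of Tefigel name spaces."""

    def __init__(self):
        self.global_vars = {}   # variables defined with globset
        self.local_vars = {}    # variables defined with set
        self._saved = []        # stack of saved non-global variables

    def set(self, name, value):
        self.local_vars[name] = value

    def globset(self, name, value):
        self.global_vars[name] = value

    def push(self):
        # Save a snapshot of all non-global variables.
        self._saved.append(dict(self.local_vars))

    def pop(self):
        # Restore the most recently saved non-global variables.
        self.local_vars = self._saved.pop()

ns = NameSpaces()
ns.set("color", "red")
ns.globset("build", "42")
ns.push()
ns.set("color", "blue")    # overwritten inside the nested scope
ns.globset("build", "43")  # the global name space ignores push and pop
ns.pop()
```

After the final pop, color holds red again, while build keeps the value 43 assigned inside the nested scope.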

Chapter 20 - Miscellaneous

20.1 - Purpose

Perform miscellaneous instructions to control Tefigel operating mode and state or to access the underlying operating system.

20.2 - Usage

Execute an operating system shell command:

   system <command-string>

Print Tefigel version string:

   version

Reset Tefigel internal state to initial state:

   reset

Dump Tefigel internal state:

   dump

Tune behavior of Tefigel:

   switch <control-variable>=<control-value>

"Hook" a user-defined Tefigel subroutine to a given processing event:

   hook <hook-event>=<hook-subroutine>

Remove association between a <hook-event> and a user-defined <hook-subroutine>:

   hook <hook-event>=

20.3 - Description

A <command-string> can be any command for the command interpreter (shell) of the underlying operating system.

Instruction reset resets all Tefigel variables and filters, and sets special characters, markers, control switches and event hooks to their default start-up state.

Instruction dump prints all available information about Tefigel status, including variables and name spaces, special characters, markers, filters, control switches, event hooks, and library path.

Switch <control-variable>'s may be set to the <control-value>'s described in the following list:

A <hook-subroutine> is a Tefigel script that can be associated with one of the <hook-event>'s described in the following list:

Chapter 21 - Further Reading

Additional information on the different aspects of the Somusar/Software Production Technique[tm] can be found in the other volumes of the Somusar/SoProTech[tm] Booklet Series, listed below.

Vol. I - somusar/SoProTech: An Introduction

An introduction to the Somusar/Software Production Technique[tm], a new, fast, and efficient technology to make high-quality multifacet software.

Vol. II - somusar/SoProTech: A Sample Project

Description of a sample project, serving as a proof-of-concept for the Somusar/Software Production Technique[tm], and the Somusar/Sisendel[tm] and Somusar/Tefigel[tm] languages. A few code examples are provided and demonstrate the practical applicability of the technique.

Vol. III - somusar/Sisendel: A Tutorial Introduction

A tutorial introduction to Somusar/Sisendel[tm], describing all features of the simple software entity design language. Several code examples practically demonstrate the conciseness and flexibility of the language.

Vol. IV - somusar/Tefigel: A Tutorial Introduction

An introduction to the syntax, semantics, and usage of Somusar/Tefigel[tm], including a vast set of code examples, illustrating the powerful features of the text file generation language.

Vol. V - somusar/Sisendel: Reference Guide

Sisendel reference guide: official definition of syntax and semantics of the Somusar/Sisendel[tm] language.

Vol. VII - somusar/SoProMach: User's Guide

The Somusar/Software Production Machine[tm] User's Guide. How to install and operate SoProMach.

Vol. VIII - somusar/tjpp: User's Guide

The Somusar/tjpp[tm] User's Guide. How to install and operate the Java[tm] preprocessor.

Vol. IX - Code Generation Somusar Style

Proof-of-concept samples of what you can generate with Somusar/SoProMach[tm].