Mumps/II /MDH Toolkit
The Mumps/II Programmer's Guide
Version 10.0

Kevin C. O'Kane, Ph.D.
Computer Science Department
University of Northern Iowa
Cedar Falls, IA 50614
okane@cs.uni.edu
kc.okane@gmail.com
http://www.cs.uni.edu/~okane
http://www.omahadave.com
Sept 7, 2008

Except for the exceptions noted in the Appendices, this document is Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006 Kevin C. O'Kane, Ph.D.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with the Invariant Sections being: Page 1, with the Front-Cover Texts being: Page 1, and with the Back-Cover Texts being: no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".


A bound, printed version of the Mumps/II documentation with examples is available at Amazon.com with the title Mumps/II.
The following students have contributed to this project:

Matthew J. Lockner, Michael D. Janssen, Chris Johnson, Jacob Good, Madhur S. Mitra

The software is distributed under one of several licenses (please see each source code module for specific copyright and license details applicable to that module). In general, the compiler itself is distributed under the GNU GPL license and the run-time support routines are distributed under the GNU LGPL.

  1. GNU General Public License

    This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

    This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

    You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

  2. GNU Lesser General Public License

    This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.

    This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

    You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

Full texts of the licenses appear at the end of this document. Compiled Mumps/II programs may call upon the Perl Compatible Regular Expression Library which, in some cases, is distributed with the Mumps/II Compiler. The license and copyright statement for PCRE appear in Appendix H.

Contents
Preface
Introduction to Mumps/II
Programming Basics
Language Specification
Data Types and Values
Variables
Global Arrays
Writing Programs
Example Programs
Operators
Commands
Builtin Functions
Builtin Variables
  1. $horolog (Abbreviation $h)
  2. $io (Abbreviation: $i)
  3. $job (Abbreviation $j)
  4. $test (Abbreviation $t)
  5. $storage (Abbreviation $s)
  6. $x
  7. $y
Installing the Compiler
  1. Other Software Needed
  2. MS Windows Install
    1. MS Visual C++ Binary Install
    2. MS Visual C++ Full Build Details
    3. Windows XP Apache Web Server Interface
  3. Linux Install
    1. Linux/Cygwin Quick Install
    2. Linux/Cygwin Full Installation
Compiling Programs
Global Arrays
Using Mumps/II with a Web Server
Uncontrolled Program Termination
Program Formats and Error Messages
Compiler Implementation Overview
Writing Programs, Functions and Calling Conventions
  1. Program Structure
  2. Main Programs and Functions
Hybrid Mumps/II and C++ Programs
Implementation Notes
Relational Algebra for Global Arrays
Programming Examples
How To Write Web Server CGI Mumps/II Scripts
Technical Notes
Index
Appendix B - GNU GPL/LGPL and Documentation Licenses
Appendix D - Mumps/II 95 Pattern Matching
Appendix E - Using Perl Regular Expressions
Appendix G - Manual Pages
Appendix H - PCRE License
Appendix I - MAC OS/X Installation Instructions


Preface

The purpose of this document is to present an overview of the Mumps/II Compiler implementation of the Mumps/II language. This package consists of source code for compilers and support libraries, for various operating systems, for a dialect of the Mumps language. Mumps/II is a simple, easy to learn string and hierarchical data base scripting language that was originally developed for medical computing applications but is now used in many other settings. Related documentation is available at:

Mumps/II Manual
MDH/C++ Manual
Information Storage & Retrieval in Mumps/II
IDF Searching of Genomic Data Bases

Introduction to Mumps/II

Mumps (also referred to as 'M') is a general purpose programming language that supports a native hierarchical data base facility. It is supported by a large user community (mainly biomedical), and a diversified installed application software base. The language originated in the mid-60's at the Massachusetts General Hospital and it became widely used in both clinical and commercial settings. A dwindling number of implementations exist for the language. There were ANSI, ISO (ISO/IEC 11756:1992) and DOD approved standards for Mumps although these have mainly lapsed in recent years.

As originally conceived, Mumps differed from other mini-computer based languages of the late 1960's by providing: (1) an easily manipulated hierarchical (multi-dimensional) data base that was well suited to representing medical records; (2) flexible string handling support; and (3) multiple concurrent tasks in limited memory on very small machines. Syntactically, Mumps is based on an earlier language named JOSS and has an appearance that is similar to early versions of Basic that were also based on JOSS.

This software package includes both a compiler and an interpreter for Mumps/II.

The Mumps/II Compiler is a translator that converts Mumps/II to C++. The compiler implements much of the most recent Mumps standard (see the manual). Mumps/II programs are translated to standard C++ programs and subsequently compiled to binary executables. The compiler distribution contains the compiler source code, the manual, the run-time functions source code, all written in C/C++, and examples, written in Mumps/II. The compiler works under Linux and Cygwin (for Windows) with the 'gcc/g++' compilers and Microsoft Visual C++ 2005 under Windows.

The Mumps/II Interpreter is a compiled Mumps/II program that permits direct execution (interpretation) of Mumps/II programs. It is available under Linux, Cygwin and Microsoft Windows.

The MDH (Multi-Dimensional and Hierarchical Data Base Toolkit) is a Linux-based, open source toolkit of portable software that supports Mumps/II-compatible, very fast, flexible, multi-dimensional and hierarchical storage, retrieval and manipulation of data bases ranging in size up to 256 terabytes. The package is written in C and C++ and is available under the GNU GPL/LGPL licenses in source code form. You must install the Mumps/II Compiler in order to use the MDH.

Programming Basics

  1. The Mumps/II programs described in this document can be run in either of two ways: as interpreted code using the Mumps/II Interpreter or as binary executables produced by the Mumps/II Compiler. Binary programs run faster than interpreted programs but the difference can be small if the programs are dominantly input/output jobs. The Mumps/II Interpreter is created by compiling the program "mumps.mps" provided with the distribution. For WinXP based systems, where a C/C++ compiler is not normally available, the interpreter may be a simpler and easier approach to developing applications. All programs that execute with the interpreter can be compiled by the compiler.

  2. How to Run a Program.

    Programs written in Mumps/II normally have the extension ".mps" when used with the compiler. Programs written for the interpreter, however, may have any extension but ".mps" is preferred. MDH programs written in C++ must have the ".cpp" extension.

    When you compile a Mumps/II program, a C++ translation of your program is made and resides on the disk with the same name but with the .cpp extension. The C++ translation is then compiled and linked with run-time libraries to build an executable binary. On MS Windows, the binary will have the same name as your original program but with the .exe extension. On Linux, the binary will have the extension .cgi and the same name as your original program. Depending on which system you are using, there will be other, intermediate files generated by the Mumps/II and C++ compilers. These are not important.

    You may compile a program either by using the built in script "mumpsc" or by manually performing each of the steps at the command prompt.

    To compile a Mumps/II program using the script, type:

    mumpsc myprog.mps
    
    This will translate your Mumps/II program to C++, run the C++ compiler on the result, and link the output of the C++ compiler with the standard Mumps/II libraries.

    To do the above steps individually, type the following sequence:

    mumps2c myprog.mps
    g++ -O3 -o myprog.cgi myprog.cpp -lmpscpp -lmumps -lmpsglobal_native -lpcre
    

    The first command ("mumps2c") translates the Mumps/II program to C++. The output will be named "myprog.cpp". The second command ("g++") runs the C++ compiler on "myprog.cpp" and links the result with the standard Mumps/II libraries (using the native global arrays). You may wish to use this sequence if your Mumps/II program has embedded C++ code that calls on special libraries that are not included in the "mumpsc" script, for example, PostgreSQL.

    Also, use the above "g++" command to compile a C++ program generated by the Mumps/II Compiler that you may have edited (for example, to insert debugging information). You may not use the "mumpsc" script to compile a ".cpp" program generated by the Mumps/II Compiler. The "mumpsc" script may only be used to compile Mumps source programs (".mps" extensions) and MDH programs (".cpp" extensions). When "mumpsc" sees you compiling a program with the ".cpp" extension, it sets certain switches that are not appropriate for a ".cpp" program generated by the Mumps Compiler.

    To compile an MDH/C++ program using the script, type:

    mumpsc myprog.cpp
    

    To interpret a program, type:

    mumps myprog.mps

    The Mumps Interpreter runs your program from source code. Note: the interpreter is subject to several restrictions and should not be used for production work. It is intended for debugging and quick, interactive viewing of global arrays.

  3. In most cases you will receive syntax error messages which identify the error and the line number in the original Mumps program containing the error. When using the compiler, in some cases, an error may be detected by the C++ compiler. If you get C++ error messages, the line number on the error message will refer to the line number in the C++ translation of your Mumps program. To translate this to a line number in your Mumps program, look into the generated .cpp file at the line number given in the C++ error message and then back track to the nearest prior commented Mumps source line. This is the line in your Mumps program that caused the problem.

    For example, if you get a message from the C++ compiler saying that you have an error at line 1234 in the C++ module, open the C++ file and move to line 1234. At that location you will see something like:

    /*=================================================================================*
    svPtr->LineNumber=4; //       write "the sum is: ",total,!
    /*=================================================================================*/
           if (svPtr->out_file[svPtr->io]==NULL) ErrorMessage("Write to input file",svPtr->LineNumber);
           svPtr->hor[svPtr->io]+=fprintf(svPtr->out_file[svPtr->io],"%s","the sum is: ");
           if (sym_(SYMGET,(unsigned char *) "total",(unsigned char *) tmp0,svPtr)==NULL) 
                VariableNotFound(svPtr->LineNumber);
           svPtr->hor[svPtr->io]+=fprintf(svPtr->out_file[svPtr->io],"%s",tmp0);
           fprintf(svPtr->out_file[svPtr->io],"\n"); svPtr->hor[svPtr->io]=0; svPtr->ver[svPtr->io]++;
    

    Notice that each original line of Mumps code and its line number in the original Mumps file appear in a comment prior to the C++ translation. To find the line of Mumps code that caused the C++ error, look for the line of Mumps code most closely preceding the line which the C++ compiler flagged as being in error. Generally speaking, you may receive C++ error messages if you reference non-existent labels or subroutines, or incorrectly specify indented do blocks (see below). It is not necessary to know C++ to do this. Also, you may see ^M (control-M) characters in the code, especially if you are viewing a MS WinXP file with a Linux editor. These are visible due to the differences between the operating systems. Under WinXP, each line ends in a carriage-return character followed by a line-feed character. Under Linux, each line ends in a line-feed character only. The control-M's you see are the carriage-returns. They are harmless.

  4. Mumps has two basic classes of variables: local and global. Local variables reside in memory and exist during program execution (like normal variables) while global variables reside on disk and continue to exist after the program has terminated. Generally speaking, global and local variable names can be used interchangeably except that all globals begin with a circumflex (^).

  5. Unlike other languages which employ many data types, Mumps has basically one data type: string. Strings that contain numbers, however, can have arithmetic operations performed on them. Neither global nor local variables need to be declared. They will be created as needed (note: an extension in the Mumps Compiler permits scalar variables to be pre-declared in order to improve performance). Variables can be destroyed by the KILL command.

  6. All Mumps statements begin with a keyword. A keyword can be fully spelled out or, in many cases, abbreviated. The common keywords are:

    set read write if else halt hang use open close do for quit break

    After a keyword there is one blank followed by (in most cases) an "argument". An argument may not have any embedded blanks unless they are within quotes ("..."). Blanks are delimiters in Mumps and their use is restricted.

    More than one command may be on the same line if there is one or more blanks following the previous command's argument. Two or more blanks are required if the previous command had no argument. For example:

    set a="abc" write a,!
    quit:a=b  write a,!
    

    In the first line, both an assignment command and a write command are on the same line. In the second line, the quit has no argument so there are two blanks separating it from the write. The figure :a=b in the quit is not an argument: it is called a post-conditional. Post-conditionals are expressions that are evaluated before command execution. If true, the command is executed. If false, the command is not executed. Most commands may have post-conditionals attached to the command word.

  7. Mumps programs that are to be compiled begin with the command zmain which has no arguments and may not be followed by any other text on the line. Interpreted programs do not require zmain. In this guide, zmain will normally appear in each program. The interpreter ignores this line.

    On the other hand, if a source code program begins with a line such as:

    #!/usr/bin/mumps

    and, if the source code file has execute permissions, and the Mumps interpreter is in /usr/bin/mumps, typing the name of the source code file will cause its execution (by the interpreter).

  8. There is no precedence. All expressions are evaluated strictly left to right unless you use parens. This can cause problems if you are not careful. The expression:

    a+b-c*d/e means: ((((a+b)-c)*d)/e)
    
    Note that:
    
    if a=b&c=d write "hello",!  
    
    means:
    
    if (((a=b)&c)=d) write "hello",!
    
    which is probably not what you wanted.  You probably wanted:
    
    if (a=b)&(c=d) write "hello",!
    

  9. The main operators are +, -, *, /, \, #, **, _, >, <, [, ], ', ?, and @. See below for a description of the operators. Note that there is an integer division operator (\) as well as a floating point division operator (/).

  10. Mumps has many functions for string processing. These are covered in the manual and you should study them. Mumps also permits indirect execution, that is, your program can create and execute code dynamically.
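
The back-tracking procedure described in item 3 above can be sketched in Python. This is only an illustration: it assumes the generated C++ marks every source line with a `svPtr->LineNumber=N; // source text` statement, as in the excerpt shown in item 3, and the function name `mumps_line_for` and the sample lines are hypothetical.

```python
import re

# Matches the marker shown in the excerpt in item 3, for example:
#   svPtr->LineNumber=4; //       write "the sum is: ",total,!
MARKER = re.compile(r"svPtr->LineNumber=(\d+);\s*//\s*(.*)")

def mumps_line_for(cpp_lines, cpp_error_line):
    """Return (mumps_line_number, mumps_source) for a 1-based C++ line number.

    Scans backwards from the flagged C++ line to the nearest marker.
    Returns None if no marker is found at or before that line.
    """
    start = min(cpp_error_line, len(cpp_lines)) - 1
    for index in range(start, -1, -1):
        match = MARKER.search(cpp_lines[index])
        if match:
            return int(match.group(1)), match.group(2).strip()
    return None

# Hypothetical fragment of a generated .cpp file (not real compiler output).
sample = [
    'svPtr->LineNumber=3; //       set total=a+b',
    '/* generated C++ for line 3 would appear here */',
    'svPtr->LineNumber=4; //       write "the sum is: ",total,!',
    'svPtr->hor[svPtr->io]+=fprintf(svPtr->out_file[svPtr->io],"%s",tmp0);',
]

# The C++ compiler flags line 4 of the generated file.
print(mumps_line_for(sample, 4))
```

In practice the list of lines would come from reading the generated .cpp file, and the line number from the C++ compiler's error message.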


Language Specification

Introduction

The purpose of this section is to provide you with an introduction to the Mumps language in general and the Mumps compiler. The Mumps language originated in the mid-60's at the Massachusetts General Hospital. The acronym stands for "Massachusetts General Hospital Utility Multi-Programming System". It is a language which is similar in some respects to BASIC but it contains many additional features not found in BASIC, or for that matter, in most other languages. In its full form, Mumps is an interpretive language. In fact, parts of the language specification require that it can never fully become a "compiled" language such as FORTRAN, COBOL or PL/I. In the compiler, some features, therefore, mainly those involving run time interpretation and symbol table binding, are removed. See above for details.

Among the features which make Mumps attractive for both bio-medical and general scientific applications are:

Hierarchical data base facility. Mumps data sets are not only organized along traditional sequential and direct access methods, but also as hierarchical trees whose data nodes are addressed as path descriptions in a manner which is easy for a programmer to master in a relatively short time.

Flexible and powerful string manipulation facilities. Mumps built-in string manipulation operators and functions provide programmers with efficient means to effect complex string manipulation and pattern matching operations.

Transportability to widely different systems. Mumps presently runs under a large number of operating systems on many machine architectures. These systems range in size from small home micro-computers to the largest central time sharing systems. Through the efforts of the Mumps Development Committee over the years, a well organized language definition has been written and formally published. This standard provides for a far tighter specification for system performance and linguistic definition than is normally the case. As a result, programs written under a Mumps system can be moved with relatively little effort from one system to another.

Full numeric data handling facilities. Mumps provides, in addition to string handling facilities, a full range of fixed and floating point computational facilities.

Data Types and Values

Basically, Mumps has only one data type: string, although it does allow integer and floating point computations as well as logical expressions. The values in a string may be any ASCII code from 32 to 126 (decimal) inclusive. The Mumps Compiler does not permit usage of the ASCII zero character (<NULL>). Ordinarily, strings will contain ASCII printable characters. String constants are enclosed in double quote marks ("). A double quote mark can be included with two adjacent quote marks (""). A constant containing only numerics (and, optionally, plus, minus and decimal point), need not be enclosed by quotes (although doing so has no effect). The following are examples of valid Mumps character string constants:

      "THE SEAS DIVIDE AND MANY A TIDE"
      "123.45"
      "BRIDGET O'SHAUNESSEY? YOU'RE NOT MAKING THAT UP?"
      """THE TIME HAS COME,"" THE WALRUS SAID"

When a string is being used as a number (e.g., in addition), the numeric portion must be 20 characters or less in length. Numeric constants are restricted to integer or decimal values (positive or negative). If a string begins with a number but ends with non-numeric characters, only the numeric leading portion will participate in operations requiring numeric operands (e.g., add, subtract, etc.); the trailing non-numeric portion is lost. On the other hand, if a string begins with non-numeric characters, its value will be interpreted as 0. The following are examples:

      1+2 will be evaluated as 3.
      "ABC"+2 will be evaluated as 2.
      "1AB"+2 will be evaluated as 3.
      "AA1"+2 will be evaluated as 2.
      "1"+"2" will be evaluated as 3.

Only one plus or minus sign is permitted at the beginning of a string or the value will be interpreted as 0. Although "string" is the basic data type, Mumps converts strings internally to floating point values (double) for calculations. Consequently, numbers are of approximately 15 digit precision. Internally, integers are stored as C++ language "long" and floating point numbers as "double."
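
The numeric interpretation rules above can be modelled with a short Python sketch. It is not the compiler's actual conversion routine: the names `numeric_value` and `add` are hypothetical, the regular expression is an assumption that covers only integer and decimal forms with at most one leading sign, and the 20 character limit is not enforced.

```python
import re

# Leading numeric portion: at most one sign, then digits with an optional
# decimal point. Anything after this portion is ignored.
LEADING_NUMBER = re.compile(r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)")

def numeric_value(text):
    """Read a string as a number: the leading numeric part, or 0 if none."""
    match = LEADING_NUMBER.match(str(text))
    return float(match.group(0)) if match else 0.0

def add(left, right):
    """Model of the + operator applied to two string operands."""
    return numeric_value(left) + numeric_value(right)

# The examples from the text, plus one with two leading signs.
for left, right in [(1, 2), ("ABC", 2), ("1AB", 2), ("AA1", 2), ("1", "2")]:
    print(repr(left), "+", repr(right), "=", add(left, right))
print(numeric_value("--5"))
```

The sketch returns Python floats, so a sum that the text writes as 3 appears here as 3.0.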

Logical values in Mumps are special cases of the numerics. A numeric value of zero is interpreted as false while a non-zero value is interpreted as true. Logical operators yield either zero or one and their results can be treated like any other numeric. Similarly, the numeric result of any numeric operator can be used as a logical operand. The results of string operators are interpreted either as zero (leading characters non-numeric) or some value (leading characters numeric). Strings and the results of string operations can therefore participate as the operands of logical operators.

Variables

Variables are named in Mumps in much the same manner they are named in other languages. A Mumps variable name must begin with a letter (A through Z) or percent sign (%) and may be followed by either letters or numbers. Both upper and lower case letters may be used for variable names. Mumps, in effect, has only one data type, so any variable may hold any value. All Mumps variables are varying length strings. The maximum string length permitted is determined when the compiler is installed. This number is usually at least 1024.

In Mumps there are no data declaration statements. Variables come into existence through the set or the read command and pass from existence through the kill command.

In Mumps, there are two kinds of arrays: internal arrays and global arrays. The following pertains to internal arrays: arrays are not dimensioned. A name used as an array variable may also, at the same time, be used as a scalar. Array values are created by assignment or appearance in a read statement. If you create an element of an array, let us say element 10, it does not mean that Mumps has created any other elements: that is, it does not imply that there exist elements 1 through 9. You may specifically create these or not. Array indices may be positive or negative numbers or character strings. Arrays in Mumps may have multiple dimensions. The following are some examples of arrays:

      set a(1,2,3)="ARRAY"
      read test(22)
      write test(22)
      set i=10
      set a(i)=10
      set a("TEST")=100
      set i="TESTING"
      set a(i)=1001
      set a("MUMPS","USERS'","GROUP")="MUG"

Global Arrays

Global arrays may be viewed either as multi-dimensional matrices or as tree structured hierarchies. For example, the following is taken from the MeSH Hierarchy. Note the original tree-like organization of the data and see the corresponding Mumps commands to represent the data in the data base. The codes used in the lines of Mumps code are the codes used by MeSH:

Cardiovascular System;A07
  Blood Vessels;A07.231
    Arteries;A07.231.114
      Aorta;A07.231.114.056
        Aorta, Abdominal;A07.231.114.056.205
        Aorta, Thoracic;A07.231.114.056.372
        Sinus of Valsalva;A07.231.114.056.847
      Arterioles;A07.231.114.060
      Axillary Artery;A07.231.114.085
      Basilar Artery;A07.231.114.106
      Brachial Artery;A07.231.114.139
      Brachiocephalic Trunk;A07.231.114.145
      Bronchial Arteries;A07.231.114.158
      Carotid Arteries;A07.231.114.186
        Carotid Artery, Common;A07.231.114.186.200
          Carotid Artery, External;A07.231.114.186.200.210
          Carotid Artery, Internal;A07.231.114.186.200.230
        Carotid Sinus;A07.231.114.186.456
      Celiac Artery;A07.231.114.207
      Cerebral Arteries;A07.231.114.228
        Anterior Cerebral Artery;A07.231.114.228.100
        Circle of Willis;A07.231.114.228.351
        Middle Cerebral Artery;A07.231.114.228.550
        Posterior Cerebral Artery;A07.231.114.228.700
        Temporal Arteries;A07.231.114.228.868

 

As matrices, data may be stored not only at fully subscripted matrix elements but also at other levels. For example, given a three dimensional matrix mat1, you could initialize it as follows:

      for i=0:1:100 do
      . for j=0:1:100 do
      .. for k=0:1:100 do
      ... set ^mat1(i,j,k)=0

In this example, all the elements of a three dimensional matrix of 101 rows, 101 columns and 101 planes (index values 0 through 100 in each dimension) are initialized to zero.

Unlike other programming languages, however, there are additional nodes of the matrix which could have been initialized such as indicated by the following example:

      for i=0:1:100 do
      . set ^mat1(i)=1
      . for j=0:1:100 do
      .. set ^mat1(i,j)=j
      .. for k=0:1:100 do
      ... set ^mat1(i,j,k)=0

In effect, this means that mat1 can also be a single dimensional vector, a two dimensional matrix and a three dimensional matrix simultaneously.

Furthermore, not all elements of a matrix need exist. That is, the matrix can be sparse. For example:

      for i=0:10:100 do
      . for j=0:10:100 do
      .. for k=0:10:100 do
      ... set ^mat1(i,j,k)=0

In the above, only index values 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100 are used to create each of the dimensions of the array and only those elements of the matrix are created. The omitted elements do not exist.
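
As a rough model of this sparseness, a Python dictionary keyed by index tuples behaves in a similar way. This is an in-memory sketch only (a real global array is disk resident), and it assumes the usual Mumps rule that the limit of a for loop is inclusive.

```python
# Model a sparse Mumps array as a dict keyed by index tuples.
mat1 = {}

# Counterpart of the loop above with a step of 10.
# The Mumps limit (100) is inclusive, hence range(0, 101, 10).
for i in range(0, 101, 10):
    for j in range(0, 101, 10):
        for k in range(0, 101, 10):
            mat1[(i, j, k)] = 0

# Values may also be stored at shorter index paths in the same array.
mat1[(10,)] = 1
mat1[(10, 20)] = 20

print(len(mat1))              # 11*11*11 three-index elements plus the 2 above
print((10, 20, 30) in mat1)   # an element that was created
print((5, 5, 5) in mat1)      # an element that was never created
```

Only the keys that were explicitly assigned exist, which is the property the text describes for omitted matrix elements.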

Global arrays are unique to Mumps. As a programmer, you will work with them as though they were arrays. The system, however, interprets them as tree path descriptions for the system's external data files. A global array is distinguished by beginning with the circumflex character (^). The remainder of the specification is the same as for an internal array. Global arrays are not dimensioned and they may appear anywhere an ordinary variable may appear (except in certain forms of the "kill" command). A typical global array specification consists of the array name followed by some number of indices (indices may be constants, variables [including internal or global arrays] or expressions of string, numeric or mixed type). For example:

      set ^a(1,43,5,99)="TEST"
      set ^ship("1ST FLEET","BOSTON","FLAG")="CONSTITUTION"
      set ^captain(^ship("1ST FLEET","BOSTON","FLAG"))="JONES"
      set ^home(^captain(^ship("1ST FLEET","BOSTON","FLAG")))="PORTSMOUTH"
      write ^ship("1ST FLEET","BOSTON","FLAG") ... CONSTITUTION
      write ^captain("CONSTITUTION") ... JONES
      write ^home("JONES") ... PORTSMOUTH
      write ^home(^captain("CONSTITUTION")) ... PORTSMOUTH
      write ^home(^captain(^ship("1ST FLEET","BOSTON","FLAG"))) ... PORTSMOUTH

The system files are viewed as trees. Each global array name ("A", "SHIP", "CAPTAIN", and "HOME" in the above) is the root of a tree. The indices are thought of as path descriptions to leaves. For example, out of the root "^a" there may be many branches; the above specifies to take the branch labeled "1" (note: this does not mean the "first" branch out of the node, it means the branch with label "1"). At the second level the specification says to take the branch labeled "43" (note: this does not imply that branches 1 through 42 necessarily exist). The path description is followed (or, possibly, created if the global array specification appears on the left hand side of an assignment statement or in a read command) to a final node. The value at the node is either retrieved or a new value stored depending upon the context in which the global array specification was used. The indices of global arrays may be numeric or character strings. The second sequence of examples above illustrates this usage.

Both numeric and character string indices may be mixed in the same path description.

A value may be stored at any position in the tree. For example:

      set ^A(1,43,5)=22
      set ^A(1,43)="TEST MIDDLE LEVEL"

Mumps arrays are not pre-declared and they are sparse. That is, only those elements which you explicitly create actually exist. For example, if you create element ^A(10), it does not necessarily mean that elements ^A(1) through ^A(9) exist.

Global arrays are often interpreted as trees where each successive index describes the path through a multi-way tree. At each node, data can be stored (or not). The path from the root to a node is given by the sequence of indices of an array reference. The data base can store many trees, each distinguished by their array name.
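
The tree interpretation can be illustrated with a small Python sketch. It is a conceptual model, not the storage scheme used by the Mumps run-time: the class `Node` and the helpers `set_global` and `get_global` are hypothetical names, and an unset node is modelled by `None` rather than by an error.

```python
class Node:
    """One node of a global array tree: an optional value plus labelled branches."""

    def __init__(self):
        self.value = None     # None models "no data stored at this node"
        self.children = {}    # branch label -> Node

def set_global(root, indices, value):
    """Follow the path given by indices, creating nodes as needed, and store value."""
    node = root
    for label in indices:
        node = node.children.setdefault(label, Node())
    node.value = value

def get_global(root, indices):
    """Follow the path and return the stored value, or None if nothing is there."""
    node = root
    for label in indices:
        node = node.children.get(label)
        if node is None:
            return None
    return node.value

# Counterpart of: set ^A(1,43,5)=22  and  set ^A(1,43)="TEST MIDDLE LEVEL"
A = Node()
set_global(A, (1, 43, 5), 22)
set_global(A, (1, 43), "TEST MIDDLE LEVEL")

print(get_global(A, (1, 43, 5)))   # value at the end of the path
print(get_global(A, (1, 43)))      # value at an intermediate level
print(get_global(A, (1,)))         # node exists on the path but holds no data
print(get_global(A, (1, 42)))      # branch 42 was never created
```

Each global array name would correspond to a separate root node, so a data base holding many trees could be modelled as a dictionary from array names to roots.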

The Mumps global array facility is due to the early uses of Mumps in medical data bases which are basically hierarchical in nature. The Mumps global arrays were a solution to the problem of how to represent the tree-like structure of patient data in a simple and easily manipulated data structure.

For example, consider a basic patient record. At the top level is the patient's id node at which is stored the patient's name. At the second level, are nodes for demographic (address, gender, phone number, etc.) data and the main entry node for clinical data. Clinical data is organized by diagnostic or problem category and each problem or diagnostic code is divided into episodes of the problem organized by onset date. For a given problem and onset, the data are divided by category (medications, lab tests, orders, notes, etc.) which are further subdivided by, for example, in the case of lab tests, test, date, time and result. For example:

Here, the tree is named patient which is also the name of the global array (notice that global arrays always have a circumflex (^) preceding their name). The Mumps code to populate the above might look like:

      set ^patient("123-45-6789")="Jones, John, J"
      set ^patient("123-45-6789","Demographics","Street")="123 Elm St"
      set ^patient("123-45-6789","Demographics","City")="Anytown"
      set ^patient("123-45-6789","Demographics","State")="IA"
      set ^patient("123-45-6789","Demographics","ZIP")="50613"
      set ^patient("123-45-6789","Dx",789.00,"6/23/2005")="Dr Smith"
      set ^patient("123-45-6789","Dx",789.00,"6/23/2005","lab","HCT","6/23/2005","10:45",45.2)=""
      set ^patient("123-45-6789","Dx",789.00,"6/23/2005","lab","HCT","6/23/2005","20:45",43.2)=""
      set ^patient("123-45-6789","Dx",789.00,"6/23/2005","lab","HCT","6/24/2005","21:10",44.2)=""
      set ^patient("123-45-6789","Dx",789.00,"6/23/2005","lab","HCT","6/25/2005","14:10",44.2)=""

Notice that the empty string can be stored at a node. In these cases, the actual data (the lab test result) is the value of the final index. Also note that each intermediate node need not be created. The nodes representing "Demographics", "lab", "Dx", "HCT", and others are not explicitly created. Their creation is implicit in constructing the longer paths of which they are intermediates.
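
The following Python sketch models how results stored in the final index can be read back, and how intermediate nodes exist only as parts of longer paths. It is an in-memory approximation under stated assumptions: the helpers `set_node` and `children` are hypothetical, and the subscript 789.00 is held as a Python float.

```python
patient = {}

def set_node(*indices, value=""):
    """Store value at the node addressed by the index path."""
    patient[indices] = value

def children(prefix):
    """Distinct next-level indices found below the given index path."""
    depth = len(prefix)
    return sorted({key[depth] for key in patient
                   if len(key) > depth and key[:depth] == prefix})

pid = "123-45-6789"
set_node(pid, value="Jones, John, J")
set_node(pid, "Dx", 789.00, "6/23/2005", value="Dr Smith")
for date, time, result in [
    ("6/23/2005", "10:45", 45.2),
    ("6/23/2005", "20:45", 43.2),
    ("6/24/2005", "21:10", 44.2),
    ("6/25/2005", "14:10", 44.2),
]:
    # The result itself is the last index; the stored value is the empty string.
    set_node(pid, "Dx", 789.00, "6/23/2005", "lab", "HCT", date, time, result)

hct = (pid, "Dx", 789.00, "6/23/2005", "lab", "HCT")
results = [(d, t, r)
           for d in children(hct)
           for t in children(hct + (d,))
           for r in children(hct + (d, t))]
print(results)

# "lab" was never assigned a value of its own, yet it has a descendant branch.
lab = hct[:-1]
print(lab in patient, children(lab))
```

Walking the index levels in this way recovers every test date, time and result without any value having been stored at the intermediate nodes.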

Writing Programs

Mumps was originally conceived as a language to be executed by interpreters. Consequently, the syntax and structure of commands tend to be line oriented and simple to parse.

Each command in Mumps begins with a keyword. The keyword may be abbreviated, in many cases to a single letter (this was to save storage and time on the early interpreters but has no effect on efficiency in the Mumps Compiler). A command is followed optionally by a post-conditional expression and then, in most cases, by one or more arguments. The blank character is a terminator in Mumps so the placement of blanks is important. A blank is used to separate a command keyword from its arguments and to terminate the list of arguments. If a command has no arguments and there are other commands following on the line, the command (or the command and its post-conditional) is followed by at least two blanks. Blanks may not be embedded in expressions except in quoted fields.

A post-conditional is an expression that will be evaluated prior to executing the arguments of a command. If the expression is true, the arguments will be executed. If the expression is false, the arguments will not be executed and control will move to the next command. Some commands, such as the if and the else may not be post-conditionalized. Other commands, such as the do and the goto may not only be post-conditionalized at the command level, but also at the argument level. Most commands are only post-conditionalized at the command level. Here are some valid commands with and without post-conditionals (";" causes the remainder of the line to be interpreted as a comment):

      set i="hello world"
      set i="hello",j="world"        ; multiple arguments to the "set"
      set:a=10 i="hello world"       ; only executed if "a=10" is true
      goto label1
      goto:b<a label1                ; executes only if "b<a" is true
      goto label1:c>d                ; executes only if "c>d"
      goto lab1:c>d,lab2:d<e         ; multiple post-conditionals
      quit:a=b  set i="hello"        ; argumentless "quit" followed by 2 blanks
      goto:a=10 lab1:c>d,lab2:d<e    ; multiple post-conditionals

The basic assignment statement in Mumps is the set statement. It may have one or more arguments. In each argument, the expression on the right hand side is evaluated and the result is assigned to the variable on the left hand side. Expressions in Mumps are evaluated strictly left-to-right without precedence. If you want a different order of evaluation, you must use parentheses. This is true in any Mumps expression in any Mumps command. It is a common source of error, especially in if commands with compound predicates.

Variables in Mumps, as noted above, can be local, that is, present only during the execution of the program, or global, which is to say, disk resident and persistent from one execution of the program to the next. Both local and global variables may be either scalars or arrays, although in practice most local variables tend to be scalars and most global variables tend to be arrays.

A variable, either global or local, is created by appearing on the left hand side of a set statement (or as an argument in a read statement). The Mumps symbol table is dynamic and variables are created as needed. They may also be destroyed with the kill command.

Expressions may involve both local and global variables, operators (see below) and constants. Constants may be either numeric or string. String constants are enclosed in double quotes (use two immediately adjacent double quotes to create a single double quote in a quoted string). Strings are converted to numerics as needed and back again. The basic internal data type is string. For example (in each of these, the item written is followed by a <new-line> character due to the "!"):

      set i=2 write i,!                ; writes 2
      set i=2*3 write i,!              ; writes 6
      set i="2"*3 write i,!            ; writes 6
      set i=" 2"*3 write i,!           ; writes 6
      set i="""hello""" write i,!      ; writes "hello"
      set ^a="test" write ^a,!         ; writes test
      set a="test" write a,!           ; writes test
      set ^a(1)="tst" write ^a(1),!    ; writes tst
      set a(1)="tst" write a(1),!      ; writes tst

The last four examples use a global scalar, a local scalar, a global array element and a local array element, respectively.
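
The conversion of strings to numerics can be sketched in Python. This is a rough model and not the run-time library's code; the function name to_number is invented here. It assumes that leading blanks are skipped (as the " 2"*3 example above suggests), that the longest leading numeric prefix is used, and that a string with no leading number is interpreted as zero, as in standard Mumps:

```python
import re

# Optional leading blanks, then the longest leading decimal number.
_LEADING_NUMBER = re.compile(r"\s*([+-]?(?:\d+\.?\d*|\.\d+))")

def to_number(value):
    """Numeric interpretation of a string value (toy model)."""
    m = _LEADING_NUMBER.match(str(value))
    if m is None:
        return 0                      # assumption: no leading number means zero
    number = float(m.group(1))
    return int(number) if number.is_integer() else number

print(to_number("2") * 3)             # 6
print(to_number(" 2") * 3)            # 6
print(to_number("123 ELM STREET"))    # 123
print(to_number("hello"))             # 0
```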

The input/output commands are read and write. They may each use formatting codes. The formatting codes are:

      !   - new line (!! two new-lines, etc.)
      #   - new page
      ?x  - advance to column "x" (new-line if needed)

The read command may contain output codes and string constants along with the variable names to be read. This permits prompts to be embedded in the read (note: this only applies when reading from the console). For example:

read !,"enter name:",?20,x

This will cause the prompt "enter name:" to be written at the beginning of a new line (!) and then the cursor will tab to column 20 and await the user's input which will be stored in variable "x".

Examples of the write command:

      write "hello world",!
      set i="hello",j="world" write i," ",j,!
      set i="hello",j="world" write i,!,j,!
      write 1,?10,2,?20,3,?30,4,!!

The first example above writes "hello world" followed by a new-line. The second does the same. The third writes "hello" and "world" on separate lines. The fourth writes 1 at the start of the line and 2, 3 and 4 at columns 10, 20 and 30, respectively, followed by two new-lines.

The argumented form of the if statement in Mumps tests one or more expressions for true and does or does not execute the remainder of the line depending upon the result. A value is considered to be "true" if it is non-zero and "false" if zero. If there are multiple arguments, each argument must be true. Note that the if command has as its scope the entire remainder of the line on which it appears, not just the next command. As a side effect, the if statement sets the system built-in variable $test to the result of the test: true if the remainder of the line is executed, false otherwise. And's ("&") and or's ("|") may be used in expressions along with not's ("'"). The relational operators are cited below. Examples:

       1     set i=1,j=2,k=3
       2     if i=1 write "yes",!          ; yes
       3     if i<j write "yes",!          ; yes
       4     if i<j,k>j write "yes",!      ; yes
       5     if i<j&k>j write "yes",!      ; does not write
       6     if i<j&(k>j) write "yes",!    ; yes
       7     if i write "yes",!            ; yes
       8     if 'i write "yes",!           ; does not write
       9     if '(i=0) write "yes",!       ; yes
      10     if i=0|(j=2) write "yes",!    ; yes

Note the multiple arguments on line 4 above. This effectively constitutes an "and" operation. Also note line 5. Due to strict left-to-right, non-precedence parse, the expression is interpreted as though it had been written "(((i<j)&k)>j)". The "i<j" is true (1); "1&k" is true (1) but "1>j" is false so the expression is false. The expression on line 6 is probably what was wanted. It is very important to include parentheses in compound expressions to express precedence.
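
The effect of strict left-to-right evaluation can be reproduced with a small Python evaluator. This is a sketch for illustration, not the compiler's parser; the names tokenize and evaluate are invented, only a handful of binary operators are modeled, and unary operators and strings are left out. Each operator is applied as soon as it is met, and only parentheses change the order:

```python
import re

TOKEN = re.compile(r"\d+\.?\d*|[A-Za-z]\w*|[-+*/<>=&!()]")

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "<": lambda a, b: int(a < b),
    ">": lambda a, b: int(a > b),
    "=": lambda a, b: int(a == b),
    "&": lambda a, b: int(bool(a) and bool(b)),
    "!": lambda a, b: int(bool(a) or bool(b)),
}

def tokenize(text):
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r}")
        tokens.append(m.group())
        pos = m.end()
    return tokens

def evaluate(text, variables=None):
    """Evaluate text strictly left to right, with no operator precedence."""
    variables = variables or {}
    tokens = tokenize(text)
    pos = 0

    def operand():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = expression()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        if tok[0].isdigit():
            return float(tok) if "." in tok else int(tok)
        return variables[tok]

    def expression():
        nonlocal pos
        value = operand()
        while pos < len(tokens) and tokens[pos] != ")":
            op = OPS[tokens[pos]]
            pos += 1
            value = op(value, operand())
        return value

    return expression()

names = {"i": 1, "j": 2, "k": 3}
print(evaluate("i<j&k>j", names))      # 0, as on line 5 above
print(evaluate("i<j&(k>j)", names))    # 1, as on line 6 above
print(evaluate("2+3*4"))               # 20, not 14
```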

The argumentless form of the if statement (which is followed by two or more blanks) tests the current value of $test and does or does not execute the remainder of the line depending upon if $test is true (1) or false (0). For example:

      set i=1
      if i=1 write "yes",!
      if  write "yes again",!

The above will write "yes" and "yes again" because the first if sets $test to true. A negative side effect of this is that $test can be reset to false in the scope of the first if:

      set i=2
      if i>1 write "yes",! if i>2 write "yes yes",!
      if  write "yes again",!

The above will write "yes" only. The second if on the second line sets $test to false. For compiled code, $test is restored to its previous value upon exit from a block:

      set i=1
      if i=1 do
      . if i=0 write 123,!
      else  write 987,!

The false if expression sets $test only within the block. The previous value is restored on exit. Note: $test is not restored on block exit in the interpreter.

The principal iterative command is the for command. It has the syntax of a local, scalar index variable followed by an equals sign followed by one or more arguments. The arguments may either be specific values the index variable should have or a range specification. A range specification is either two numbers separated by a colon or three numbers separated by two colons. In both cases, the first number is the initial value for the index variable, and the second is the increment (or decrement if negative). The third number, if present, is the limit value. If no third number is specified, there is no limit and some other mechanism must terminate the loop. For example:

      for i=1:1:10 write i,!            ; 1,2,...9,10
      for i=1,3,5,7 write i,!           ; 1,3,5,7
      for i=10:-1:0 write i,!           ; 10,9,...2,1,0
      for i=1:1 write i,!               ; 1,2,3 ...
      for i=1:1:10,100,200 write i,!    ; 1,2,...9,10,100,200
      for i=10,20,30:1:40 write i,!     ; 10,20,30,31,...40
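
The way a list of for arguments expands into index values can be modeled with a Python generator. This is a sketch only; the name for_values and the use of tuples to stand in for colon-separated ranges are conventions invented here. A plain value is used as is, a pair is a range with no limit, and a triple is a range with a limit:

```python
from itertools import count, islice

def for_values(*arguments):
    """Yield the index values produced by a list of for-command arguments.

    Each argument is a single value, a (start, step) pair with no limit,
    or a (start, step, limit) triple.
    """
    for arg in arguments:
        if not isinstance(arg, tuple):
            yield arg
        elif len(arg) == 2:
            start, step = arg
            yield from count(start, step)      # never ends on its own
        else:
            start, step, limit = arg
            value = start
            while (value <= limit) if step >= 0 else (value >= limit):
                yield value
                value += step

print(list(for_values((1, 1, 10))))              # 1 through 10
print(list(for_values(1, 3, 5, 7)))              # [1, 3, 5, 7]
print(list(for_values((10, -1, 0))))             # 10 down to 0
print(list(islice(for_values((1, 1)), 5)))       # first five of an endless range
print(list(for_values((1, 1, 10), 100, 200)))    # 1 through 10, then 100, 200
print(list(for_values(10, 20, (30, 1, 40))))     # 10, 20, then 30 through 40
```

Because a range without a limit never ends, any argument listed after it is never reached in this model.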

A for with no limit value, or with no arguments at all, loops forever until stopped by another command. The quit command will terminate the nearest for command on the current line:

      for i=1:1 write i,! quit:i>9    ; 1,2,...9,10

If there are multiple for commands, the quit terminates the nearest:

      for i=1:1:10 for j=1:1 write j,! quit:j>4

The above will write 1 through 5 a total of 10 times.

The for is often used with the argumentless form of the do command. The argumentless form of the do assumes there will be a block of code immediately following the current command line. The block will be executed. For example:

      for i=1:1:10 do
      . write i
      . write " ",i*i,!

The above will write the numbers 1 through 10 followed by the square of the number.

In the compiled code, when you enter a block by means of an argumentless do, the value of $test is stored and restored on normal exit from the block. If you exit the block by means of a goto, the value of $test is not restored. $test is not restored under any circumstances in the interpreter.

In the context of a block, the quit command means to quit the block, not the for. For example:

      for i=1:1:10 do
      . write i
      . if i>5 write ! quit
      . write " ",i*i,!

The above will write 5 lines with numbers and their squares and 5 lines with just the numbers (6 through 10). The quit terminates the block but not the for. A break (this is a non-standard feature of the Mumps Compiler) may be used to exit a block:

      for i=1:1:10 do
      . write i
      . if i>5 write ! break
      . write " ",i*i,!

The above will write 5 lines with numbers and their squares followed by one line containing only the number 6. The break terminates the block and the for. A break may only be used in a block. Alternatively, the quit could be used as follows:

      for i=1:1:10 do  write ! quit:i>4
      . write i
      . if i>4 quit
      . write " ",i*i

The above will write 5 lines, 4 of which have the number and its square. Note the two blanks after the argumentless do.

Another form would be:

      set i=1
      for  do
      . write i," ",i*i,!
      . set i=i+1
      . if i>10 break

The above will write 10 lines, each with two numbers.

Blocks may be nested but the break only applies to the block in which it appears:

      set i=1
      for  do
      . write i," "
      . set j=1
      . for  do
      .. write j," "
      .. set j=j+1
      .. if j>10 break
      . write !
      . set i=i+1
      . if i>10 break

The above will write ten lines, each with 11 numbers (1 number for the outer loop and 10 numbers due to the inner loop).

The do command is also used to invoke subfunctions. See the discussion and examples below under the do command.

For input/output, Mumps uses a scheme based on unit numbers. The Mumps compiler permits 10 unit numbers. By default, your program begins using unit 5, the console screen and keyboard (stdin and stdout in Linux terminology). You may open files and associate them with unit numbers through the open command and disassociate files from unit numbers with the close command. It is important that you close files that you have used for output as this will ensure that the buffers have been written to disk. The use command tells the system which I/O unit to use. This unit remains in effect until changed. For example, the following copies the contents of file "aaa.dat" to file "bbb.dat" which is created if it did not previously exist:

      zmain
      open 1:"aaa.dat,old"
      open 2:"bbb.dat,new"
      write "copying ...",!
      for  do
      . use 1
      . read rec
      . if '$test break
      . use 2
      . write rec,!
      close 1,2
      use 5
      write "done",!

Each read command sets $test to true if successful. On end of file, $test is set to false and the loop terminates. The ",old" and ",new" markers indicate if the file is being opened for input (",old") or output (",new").

Example Programs

Mumps is a very simple but powerful string manipulation and data base scripting language. It is similar in structure to early forms of BASIC and can generally be learned in a few hours. The following are a series of examples of Mumps programs and explanations of the code.

  1. Program to sum the first 100 numbers:

    zmain
    set total=0
    for i=1:1:100 set total=total+i
    write "the sum is: ",total,!

    Notes:

    • A program can be written using a text editor such as vi, emacs or some other editor that does not embed special characters in the text (as some word processor programs do).

    • Lines of Mumps code may begin with a blank, a plus sign (+), a pound sign (#), a tab character, or a letter.

      • If a line begins with a pound sign, the remainder of the line is taken as a comment and not processed by the compiler.
      • If a line begins with a plus sign, the remainder of the line is taken to be C++ code and is passed unchanged to the C++ compiler.
      • If the line begins with a letter, the word beginning with the letter up to the first blank on the line is taken as a label. There must be at least one blank following a label.
      • If the line begins with a tab character or one or more blanks, the line has no label. The code for the line begins at the first non-blank character.

    • Each line has one or more commands, each beginning with a keyword. Most commands have operands which are separated by exactly one blank from the command word. In cases where the command has no operands, the command is either the final item on a line or it is separated from the next command by two or more blanks. Operands to commands may not have embedded blanks except in quoted fields.

    • Every program has at least one zmain command that identifies the main program. The zmain must be on a line by itself.

    • In the example above, the set introduces the assignment command. The variable "total" is created (first use) and assigned the value zero.

    • The for command is the iterative command and can have 3 arguments: starting value, increment and limit. It can also have other forms (see below). The scope of the for command shown is the current line.

    • The write command writes variables, strings, numbers, etc. Each element is separated from the others by a comma. The "!" means new line. You may also use something of the form ?15 to mean tab to position 15 (a numeric constant is required).

  2. Program to read a file of numbers and sum them.

    zmain
    set sum=0
    for  do
    . read val
    . if '$test break
    . set sum=sum+val
    write "sum is ",sum,!

    • The program reads numbers until it receives an end-of-file. The loop control is the for command with no arguments. Note there are 2 blanks after the word for. These are REQUIRED as there is no argument. A for with no arguments loops until a quit (single line context) or a break (multiple line do context).

    • The input command in Mumps is read. It reads from the current I/O device (see below, by default, this is stdin). Mumps reads a line at a time thus multiple values should, in this case, be on separate lines (note: there are, however, functions to extract separate values from a line - see below).

    • The do command is used to invoke a section of code as a subroutine. It often is followed by an argument giving the name of the entry point of the code module being invoked. However, you use the do without an argument if you want to invoke a block of code contained on a group of lines immediately following in the code. The lines of the block must each begin with one or more "." characters to express the level of depth and to tell the compiler the extent of the block. You may have blocks within blocks. Nested blocks have more leading "." characters. A block ends when the indent (".") level reduces. In the example above, the block of code invoked by the do command is the three lines immediately following the do, each having one ". " preceding the code on the line.

    • Although the for has single line scope, the do effectively extends the scope to subsequent lines. The break, in this case, is a non-standard extension that exits both the do and its controlling for.

    • A break is not permitted in a single line for - use quit. A quit on a single line for terminates the for. HOWEVER, a quit in a do block means "continue" - that is, skip the remainder of the block and return to the command following the controlling do. This often results in the block being re-invoked.

    • The builtin variable "$test" indicates if something worked or not. If the previous read succeeded (i.e., did not reach end-of-file), "$test" is true, false otherwise. The single quote "'" is the "not" sign so the expression in the if command means "not $test" which will be true if the read failed (end of input).

    • Some Mumps commands such as for and if have a scope that extends through the end of the line on which they appear. This can cause some unusual consequences.

      For example, if the previous program were written with multiple commands following on the same line as the for:

      zmain
      set sum=0
      for  read val if '$test quit  set sum=sum+val
      write "sum is ",sum,!

      it would not work. This is because of the if command. If the if expression is not true, the remainder of the line is not executed. Thus, if the read were successful, the expression in the if would be false and the remainder of the line, including the set statement, would not execute. There is a way around this sort of problem:

      zmain
      set sum=0
      for  read val quit:'$test  set sum=sum+val
      write "sum is ",sum,!

      Here the "not $test" is applied as a postconditional (note the ":") to the quit command. When a postconditional is applied, the command is only effective if the postconditional is true. The quit command on a one line for statement means to terminate the loop. Because the quit has no argument (a postconditional is not an argument), at least two blanks are required before the next command. Most commands may have postconditionals. Overall, you should probably use the do and a dotted indent block.

  3. Write a program to read a line of text and separate the words.

    zmain
    read line
    set line=$zb(line)                // replaces multiple blanks by one blank
    for i=1:1 do
    . set word=$piece(line," ",i)     // separates the string delimited by " "
    . if word="" break
    . write word,!

    when given the input:

    How much wood would a wood chuck chuck if a wood chuck would chuck wood?

    produces:

    How
    much
    wood
    would
    a
    wood
    chuck
    chuck
    if
    a
    wood
    chuck
    would
    chuck
    wood?

    Notes:

    • The function $zb() returns a string in which multiple blanks have been replaced with single blanks. Builtin functions in Mumps begin with a dollar sign ($). Functions whose names begin with a "z" are extensions to the legacy language standard. A complete list of these functions can be found in the manual.

    • The form of the for command in the example initializes "i" to 1 (the first constant after the equals sign) and increments it by 1 (second constant) after each iteration. There is no upper limit.

    • The builtin function $piece(source,pattern,start[,end]) returns a string from "source" delimited by "pattern". The string returned follows the "start"-1 instance of the pattern and ends at the "end" instance of pattern if "end" is present, "start" otherwise. An empty string is returned if the function cannot extract the requested piece of the source string. For example:

      set a="abc.def.ghi.jkl"
      write $piece(a,".",1)      -> abc
      write $piece(a,".",2)      -> def
      write $piece(a,".",1,2)    -> abc.def

  4. Write a program to read two numbers and a numeric operator, perform the operation and report the result. That is, use indirection (interpretation).

    zmain
    read first
    read second
    read op
    if first'?.N write "operand 1 must be numeric",! halt
    if second'?.N write "operand 2 must be numeric",! halt
    if $length(op)'=1 write "operator must be a single character",! halt
    if $find("+-*/#",op)=0 write "unknown operator",! halt
    set str=first_op_second
    write str,"=",@str,!

    input:

    10
    2
    *

    output:

    10*2=20

    Notes:

    • The expression first'?.N is an example of the pattern match operator (?). It means "first not-pattern .N" or, the expression is true if the contents of first are not consistent with the pattern ".N". The figure ".N" means "some number of numerics" ("." means any number - including zero, "N" means numeric). See the manual for other examples and pattern matching codes.

    • The halt command terminates the program.

    • The $length() function returns the length of a string. The $find() function (see manual) returns an integer that is zero (false) if the second string is not found in the first string. It returns a positive integer if it is found. See the manual for a description of what the integer means. In the case of the example, the function is used to determine if the operator is one of the allowed operators.

    • The "_" operator means concatenate two strings together.

    • The "@" is the indirection or interpretation operator. It causes the expression following it to be evaluated and executed. Interpretation is slow but handy. In the example, the string "str" contains an expression composed of the values read in from the keyboard with the operator between them (an arithmetic expression). When the "@" operator is applied to a string containing a valid Mumps expression, the system evaluates the expression (at run-time) and returns a string containing an answer.

      Some legal forms for indirection:

      write @"2+3",!         // writes 5
      set ^a(1)="3*3"
      write @^a(1),!         // writes 9
      set ^a(9)="4+4"
      set i="8+1"
      write @^a(@i),!        // writes 8

      Indirection should be used sparingly since it is inefficient.

    • You can trap errors from the interpreter with:

      zmain
      set x="1/0"
      try NoMessages set i=@x
      catch InterpreterException write "error",! halt

    • The statement:

      write @(first_op_second),!

      also works.

  5. Write a program that reads in an expression and then applies it to each line of a file:

    zmain
    read "enter expression: ",exp
    open 1:"input.dat,old"
    if '$test write "file open error",! halt
    for  do
    . use 1
    . read line
    . if '$test break
    . if @exp use 5 write line,!

    example input expression: $find(line,"abc")

    Notes:

    • The program will, by interpretation, apply the expression to each line of input. If the expression is "true", the input line will be printed.

    • The "open" statement opens an input file. Open files are referred to by unit numbers that you pick (1 thru 4, normally). The ",old" part means the file exists. If you are creating a file, use ",new". If the file open fails, $test is false.

    • The unit number for your keyboard/screen is 5. It is the only unit number that you can both read from and write to. A common error is to try to write to an input file or read from an output file. When you do a read or write, Mumps uses the current unit number, usually 5. If you want another, use the use command. After a unit number is specified, it remains in effect until changed.

  6. Write a program to read a file and build a dictionary of the words found in the file and count the total number of times each word occurs. Input is from stdin.

    zmain
    for  do
    . read line
    . if '$t break
    . set line=$zb(line)               // replaces multiple blanks by one blank
    . for i=1:1 do
    .. set word=$piece(line," ",i)     // separates the string delimited by " "
    .. if word="" break
    .. if '$data(^dict(word)) set ^dict(word)=1
    .. else  set ^dict(word)=^dict(word)+1    // caution - "else" isn't what it seems
    set word=""
    for  do
    . set word=$order(^dict(word))
    . if word="" break
    . write word," ",^dict(word),!

    • The $piece() function returns the ith piece of the first string delimited by the second.

    • The $data() function is true if the global array node exists. It is false otherwise.

    • The $order() function returns the next ascending value of the final index of the global array argument or the empty string if there are no more. Indices are retrieved in collating sequence order (ASCII). If the value of the argument is the empty string, the function returns the value of the first index, if one exists. On successive iterations, the value passed to the function is the current index value, which results in retrieving the next index value, if one exists.

    • The else command (notice the two blanks after it - it has no arguments) is not really connected to the if statement. In Mumps, else checks the value of $test and executes the remainder of the line if $test is false. No preceding if is required - the else is purely related to $test. One consequence of this is that if you have an if statement that executes (its expression is true), and an else on the following line, if the line executed as a result of the if causes $test to become false, the subsequent else will execute! Note also, an if command may also have no arguments (followed by at least two blanks). If this is the case, it tests $test and executes the remainder of the line it is on if $test is true.

    • Notice the double nested loops expressed by the level indent dots. The outer loop reads input while the inner loop processes each line. When the inner loop terminates, the outer loop iterates.

    • The program reads a line from standard input, removes extraneous blanks, then extracts successive words from the line (until the string returned from $piece() is empty). Each word extracted is checked against the data base. If the word is not in the global array as an index ($data() returns zero), the global array element for this word is initialized to one; otherwise, the value stored in the global array, the count of the number of times this word has occurred, is incremented. When the input is exhausted, the program prints the words (indices) and the count stored at each one.

  7. Write a program to produce an indented display of the indices of a global array tree. Assume the tree has no more than 4 levels of depth.

    Notes:

    • The program is written as a set of four nested for loops.

    • The outer loop iterates the first index; the first nested loop iterates the second index, and so forth.

    • Should the global array have fewer than four levels of indices, the program will still work. If there are no indices at a given level, level three for example, the third nested loop will immediately receive an empty string and the loop will terminate. The fourth nested loop will not be attempted.

    • Notice that prior to each loop, the index variable is set to the initial value of empty string and that the value returned by $order() is tested for empty string. The first value presented to $order() is empty and the last value returned is empty.

    zmain
    for i=1:1:5 for j=1:1:5 for k=1:1:5 for l=1:1:5 set ^tree(i,j,k,l)=i
    for lev1="":$order(^tree(lev1)):"" do
    . write lev1,!
    . for lev2="":$order(^tree(lev1,lev2)):"" do
    .. write ?5,lev2,!
    .. for lev3="":$order(^tree(lev1,lev2,lev3)):"" do
    ... write ?10,lev3,!
    ... for lev4="":$order(^tree(lev1,lev2,lev3,lev4)):"" do
    .... write ?15,lev4,!

    The program produces output such as:

    1
         1
              1
                   1
                   2
                   3
                   4
                   5
              2
                   1
                   2
                   3
                   4
                   5
              3
                   1
                   2
                   3
                   4
                   5
              4
                   1
                   2
                   3
                   4
                   5
              5
                   1
                   2
                   3
                   4
                   5
         2
              1
                   1
                   2
                   3
                   4
                   5
              2
                   1
                   2
                   3
                   4
                   5
              3
                   1
                   2
                   3
                   4
                   5
              4
                   1
                   2
                   3
                   4
                   5
              5
                   1
                   2
    . . .

    In the example, the for loops initialize, increment and terminate the loops automatically. Each loop initializes its loop variable with the empty string, immediately runs the incrementing function, and tests the result for the empty string. If the empty string is returned, the loop terminates. Otherwise, the loop body is executed. Notice that after initialization, the increment is done immediately, thus providing the true initial value for the loop variable. This occurs only when a function replaces the increment position. Normally, in iterative loops, the loop is executed once with the initial value prior to adding the increment. There are other extended forms of the for command. See the manual.
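
The behavior of $piece(), $data() and $order() as used in the word-count program (example 6) can be sketched in Python. This is a rough model for illustration, not the run-time library: the names piece, order and count_words are invented, an ordinary dictionary stands in for the ^dict global array, and " ".join(line.split()) is only an approximation of $zb():

```python
import bisect

def piece(source, delimiter, start, end=None):
    """Return pieces start..end (counted from 1) of source split on delimiter."""
    if end is None:
        end = start
    parts = source.split(delimiter)
    return delimiter.join(parts[start - 1:end])

def order(array, index):
    """Return the index following the given one in sorted order, or "" if none."""
    keys = sorted(array)
    pos = bisect.bisect_right(keys, index)
    return keys[pos] if pos < len(keys) else ""

def count_words(lines):
    dictionary = {}
    for line in lines:
        line = " ".join(line.split())       # approximates $zb(): squeeze blanks
        i = 1
        while True:
            word = piece(line, " ", i)
            if word == "":
                break
            if word not in dictionary:      # plays the role of '$data(^dict(word))
                dictionary[word] = 1
            else:
                dictionary[word] += 1
            i += 1
    return dictionary

counts = count_words(["how much  wood would a wood chuck chuck"])

# Walk the indices the way the final loop of the program does.
word = ""
while True:
    word = order(counts, word)
    if word == "":
        break
    print(word, counts[word])
```

Starting from the empty string and stopping when the empty string comes back is the same convention the $order() loops above rely on.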

Operators

  1. Arithmetic unary operators: + -

    The arithmetic unary operators are: + and -. The plus operator (+) has no effect other than to force the expression to its right to be interpreted as numeric. The minus operator forces numeric interpretation and negates the result. For example:

    set I="123 ELM STREET"
    write +I      yields 123
    write -I      yields -123

  2. Arithmetic binary operators: + - * / \ # **

    The addition (+), subtraction (-), multiplication (*) and exponentiation (**) operators perform in the normal manner. Operands are given a numeric interpretation if necessary. Operands may be either expressions, constants, variables or array references. Results are computed in floating point if appropriate. Mumps has two division operators: full division (/) and integer division (\). Full division gives results which may have fractional parts. Integer division truncates the answer to an integer.

    The modulo operator (#) gives the left operand modulo the right operand. The following are examples:

    2+3       yields 5
    2.31+1    yields 3.31
    3-5       yields -2
    7/4       yields 1.75
    7\4       yields 1
    11#3      yields 2 (please see compiler notes)
    3**2      yields 9

  3. Arithmetic relational operators: > <

    The greater than (>) and less than (<) relational operators compare numbers. If the operands are not numbers, they are given a numeric interpretation. The result is either zero for FALSE or one for TRUE. For example:

    1>2    yields 0
    2>1    yields 1
    1<2    yields 1
    2<1    yields 0

    Both operators may be negated producing not greater than ('>) and not less than ('<) (note: the single quote mark is the negating operator). There are no "less than or equals" or "greater than or equals" operators as such. For example:

    1'>2    yields 1
    2'>1    yields 0
    1'<2    yields 0
    2'<1    yields 1

  4. String binary operator: _

    The only binary string operator is concatenation (_) represented by an underscore character. The following are examples:

    "ABC"_"XYZ"    yields "ABCXYZ"
    "ABC"_123      yields "ABC123"
    123_456        yields 123456

  5. String relational operators: = [ ] ]] ?

    The equals relational operator (= and '=) tests for equality as in the following example:

    if "ABC"="ABC" write "EQUALS"
    if "ABC"'="ABC" write "EQUALS"

    We would expect that "EQUALS" would be written to the terminal in the first case and not in the second. The not-equals operator is formed by the single quote mark and the equals sign. The equals and not-equals operator may be used with strings or numbers.

    The contains operator ([ and '[) determines if the right hand operand is contained in the left hand operand. For example:

    set A="NOW IS THE TIME"
    if A["THE" write "YES"
    if A'["THE" write "YES"

    The word "YES" would be printed on the terminal for the first example and not printed for the second.

    The follows and sorts after operators (] '] ]] and ']]) are used to test if the left hand operand follows the right hand operand in the collating sequence. For example:

    set A="ABC"
    if A]"AAA" write "YES"
    if A]]"AAA" write "YES"
    if A']"AAA" write "YES"
    if A']]"AAA" write "YES"

    The word "YES" would be printed at the terminal for the first two examples and not printed for the second two.

    The pattern matching operator (? and '?) is used to determine if a string conforms to a certain pattern. Pattern match operations are converted to Perl Compatible Regular Expressions and are executed by functions in the PCRE library which must be present. You may access the PCRE directly, using Perl expression format with the ^perlmatch(string,pattern) function discussed in Appendix E.

    The Mumps patterns are:

    A for the entire upper and lower case alphabet.
    C for the 33 control characters.
    E for any of the 128 ASCII characters.
    L for the 26 lower case letters.
    N for the numerics.
    P for the 33 punctuation characters.
    U for the 26 upper case characters.
    A literal string.

    A pattern code is made up of one or more of the above, each preceded by a count specifier. The count specifier indicates how many of the named item must be present. Alternatively, an indefinite specifier - a decimal point - may be used to indicate any count (including zero). For example:

    set A="123-45-6789"
    if A?3N1"-"2N1"-"4N write "OK"
    if A'?3N1"-"2N1"-"4N write "OK"
    set A="JONES, J. L."
    if A?.A1",".A write "OK"
    if A'?.A1",".A write "OK"

    Extensions to the pattern syntax suggested by the Mumps 95 standard, including support for alternation, are supported as described in Appendix D.

  6. Logical operators: & ! '

    The logical operators AND (&), OR (!) and NOT (') may be applied in the usual manner. The user should note, however, that since Mumps evaluates strictly left-to-right, the results can sometimes be odd:

    1&1        yields 1
    2&1        yields 1
    1&0        yields 0
    1&0<1      yields 1
    1&(0<1)    yields 1
    1!1        yields 1
    1!0        yields 1
    0!0        yields 0
    2!0        yields 1

    The "NOT" operator may be used in conjunction with other operators to form compound operators. The resulting compound operators are:

    '< not less than
    '> not greater than
    '= not equal
    '[ not contains
    '] not follows
    '? not pattern

  7. The indirection operator: @

    The indirection operator permits direct interpretation of string expressions. For example:

    set a="2+2"
    write @a,!

    will result in the value "4" being written to the current output device.

Commands

Each statement in Mumps begins with a unique command word. Often the command word is abbreviated to a single character. The single character abbreviations are unique for all commands except those which begin with the letter "Z". For commands not beginning with the letter "Z", Mumps does not check the spelling of the command word if more than one character of the spelling is given. Generally speaking, only the first letter is used to determine the command. Thus "write", "W", and "WRIGHT" all have the same meaning.

The format of a command normally consists of the command word or letter followed (optionally) by a post-conditional, followed by exactly one blank, followed by the arguments to the command. Most commands can have multiple arguments. Multiple arguments are delimited by commas. If a line is to have more than one command, the first command is delimited by exactly one blank and the next command word or letter follows immediately. Blanks are very significant in Mumps.

As noted above, most commands may be "post-conditionalized". A post-conditional is a logical expression which is used to determine if the command (and all its arguments) should be executed. It is like a small "if" statement. A post-conditional appears as a colon followed by an expression. If the expression evaluates to 0 (false), the command (or argument) is not executed. If the expression evaluates non-zero, the command or argument is executed.

The following are examples of the above:

an ordinary assignment statement:

set I=10*5

same as above with command word abbreviation:

s I=10*5

an assignment statement with multiple arguments:

s I=10*5,J=5,K=I+J ;(K will equal 55)

an assignment statement post-conditionalized:

s:I=10 J=0 ;(set J to zero if I equals 10)

a multiple command line:

s I=10*5 s J=5 s K=I+J ;(same as above)

Table of Commands

  1. break (abbreviation: b)

    Mumps usage is non-standard. It may be used to exit an indented block. It may not be used in any other context.

    For example:

    for i=1:1:10 do
    . write i,!
    . if i>5 break

    The above will print 1 through 6. See Notes on break and quit for further details.

  2. catch (abbreviation: ca)

    (compiler only) A catch command immediately follows a try command. If one of the commands on the try command line fails, the catch is entered if the cause of the failure is the exception noted in the argument to the catch command. The permitted arguments are:

    1. InputLengthException
    2. InterpreterException

    The try command may be followed by one or more optional control words. If no control words are present, the command MUST be followed by at least two blanks. The optional control words are:

    1. NoMessages

      Used when invoking the interpreter to suppress interpreter error message generation. If NoMessages is not present and there is an error detected during interpretation, the interpreter will generate specific error messages to stdout.

    Example:

    try  read a
    catch InputLengthException write "line too long",! halt
    write a
    set a="2+2"
    try NoMessages write @a,!
    catch InterpreterException write "expression error",!

  3. close (abbreviation: c)

    The close command closes a unit number and makes it available for other uses. It also frees the system buffers for other uses. The argument must evaluate to a number which corresponds to an open unit. In Mumps this must be in the range of 1 to 4 (other values will terminate your program). An output file must be closed explicitly. Failure to do so may result in loss of some or all of the file. The close command may be post-conditionalized and it may have multiple arguments.

    close 4
    close I
    close:K=J 1,2

  4. continue (no abbreviation)

    See quit

  5. declare (no abbreviation)

    The non-standard declare command can be used to improve speed by declaring Mumps variables as permanent variables in the C++ environment. A variable so declared does not need to be looked up in the run-time symbol table each time it is used. This can significantly improve speed of execution. Only unsubscripted Mumps variables may be declared. They may not be killed. Variables must be declared within the main program or a subroutine (i.e., not globally) and they have the scope of the entire routine in which they are declared. Declared variables may be passed to subroutines but a received parameter may not be declared in the scope of the receiving subroutine. Example:

    zmain
          declare i,j,k
          for i=1:1:10 do
          . for j=1:1:10 do
          .. for k=1:1:10 do
          ... set ^a(i,j,k)=0

  6. do (abbreviation: d)

    • The do command with an argument causes the program to branch to the label specified by the argument and continue execution beginning at that label. Execution proceeds until a quit command is encountered at which time execution returns to the invoking do. If an end of source file is encountered, it is interpreted as a halt command.

      • Interpreter only: The do may invoke disk resident, uncompiled, Mumps routines.

    • The do command may be post-conditionalized and its arguments may be post-conditionalized.

    • The argumentless do causes the code on the immediately following line to be executed. This line must be the beginning of a block with a level of dotted indent greater than that of the line containing the do. This block is terminated by a quit, a break or by encountering a line with a lower level of dotted indent. An argumentless do must be followed by two blanks (unless it is the last command on a line). It may be post-conditionalized.

    • Within an argumentless do block, a quit will return to the line containing the original do and resume execution of that line. A break will exit the block but resumes execution with the line following the block.

    • See the discussion on the effect of break and quit on blocks invoked by do commands.

    • do commands with arguments may be post-conditionalized at the argument level.

    • do with arguments:

      Arguments may have three forms:

      1. Names of labels local to the current routine, with or without parameters:

        zmain
              write "hello "
              do www
              write !
              halt
        www write "world"
              quit
        ---------------------------
        zmain
              write "hello "
              do www("world")
              write !
              halt
        www(a) write a
              quit
        ---------------------------
        zmain
              write "hello "
              write $$www("world"),!
              write !
              halt
        www(a) quit a

        Each of the above writes "hello world" by executing a local subroutine. Note that invoking the routine as a function requires the "$$" while invoking it with a do does not include the "$$".

      2. Names of subroutines linked to the current executable module such as the following:

        ^www() write "world"
              quit
        zmain
              write "hello "
              do ^www
              write !
              halt
        ---------------------------
        ^www(a) write a
              quit
        zmain
              write "hello "
              do ^www("world")
              write !
              halt
        ---------------------------
        ^www(a) quit a
        zmain
              write "hello "
              write $$^www("world")
              write !
              halt

        Note the empty open/close parentheses in the first example. These are required at present. For a thorough explanation of the meaning of "$$" and "^" before function labels, see the section on Functions.

      3. Names of other binary executables. These are other Mumps programs (although they could be written in other languages if they conform to the calling conventions) compiled to binary executables. All are assumed to have the ".cgi" extension:

        zmain
              write "hello "
              do ^pgm1
              write !
              halt

        where pgm1 is pgm1.cgi compiled from the following:

        zmain
              write "world"
              halt

        Programs invoked in this manner may not, at present, have parameters nor may they return values. They do, however, have all the variables and values from the calling program's symbol table available and any new variables the called program may create or any changes to existing variables will be reflected in the calling program's symbol table upon return. This method is relatively slow and should be used sparingly. It creates a separate shell, stores the current symbol table on disk (in a file whose name is the same as the calling program's process id), and invokes the program. The called program, upon invocation, loads the symbol table, executes, dumps the symbol table back to disk and terminates. The shell created terminates and control is returned to the calling program which loads the revised symbol table.

      The embedded interpreter will permit constructions of the form:

      do ^ABC

      which will cause the file "ABC" to be loaded and interpreted. In compiled code, this format causes the execution of a separately compiled Mumps function.

      If the label following the do command begins with two circumflexes (^), a separately compiled C++ function is invoked. The function may be declared in the current file or in a separate file. Functions from separate files may be pre-compiled and linked. See the section on Functions. If a separately compiled function has no arguments, the open and close parentheses are not required when invoked by do.

      A label of the form xxx^yyy may be used in the compiler for separately compiled functions. The xxx is interpreted as a label appearing in the routine yyy.

    • do without arguments: (see: Notes on break and quit for details)

      for i=1:1:10 do  write " continuing"
      . write !,i
      . if i>5 quit
      . write " ",i
      write !,"done",!
      writes:
      1 1 continuing
      2 2 continuing
      3 3 continuing
      4 4 continuing
      5 5 continuing
      6 continuing
      7 continuing
      8 continuing
      9 continuing
      10 continuing
      done
      ------------------------------------------
      set i=9
      if i>0 do  write " continuing"
      . write !,i
      . if i>5 quit
      . write " ",i
      write !,"done",!
      writes:
      9 continuing
      done
      ------------------------------------------
      for i=1:1:10 do  write " mark " do  write " end of line",!
      . write i
      . if i>5 quit
      . write "X"
      writes:
      1X mark 1X end of line
      2X mark 2X end of line
      3X mark 3X end of line
      4X mark 4X end of line
      5X mark 5X end of line
      6 mark 6 end of line
      7 mark 7 end of line
      8 mark 8 end of line
      9 mark 9 end of line
      10 mark 10 end of line
      ------------------------------------------
      for i=1:1:10 do  write " continuing"
      . write !,i
      . if i>5 break
      . write " ",i
      write !,"done",!
      writes:
      1 1 continuing
      2 2 continuing
      3 3 continuing
      4 4 continuing
      5 5 continuing
      6
      ------------------------------------------
      set i=9
      if i>0 do  write " continuing"
      . write !,i
      . if i>5 break
      . write " ",i
      write !,"done",!
      writes:
      9
      done
      ------------------------------------------
      for i=1:1:10 do  write " mark " do  write " end of line",!
      . write i
      . if i>5 break
      . write "X"
      write !
      writes:
      1X mark 1X end of line
      2X mark 2X end of line
      3X mark 3X end of line
      4X mark 4X end of line
      5X mark 5X end of line
      6

  7. else (abbreviation: e)

    The else command tests the value of the system wide built-in variable $test. If $test (abbreviated as $t) is zero, the remainder of the line on which the else appears is executed. If $t is not zero, the remainder of the line is not executed. $test is set, among other ways, by the if statement. Since else does not take arguments, it must be followed by two blanks.

    For example:

    else  set i=10

    The else does not require a preceding if statement. It depends solely on the value of $test. See the discussion below on details as to how $test may be set.

  8. export (abbreviation: ex)

    The export command exports the variables given as arguments to the next outer symbol table:

    ^tst(i,j) write "in subroutine ",$data(i)," ",$data(j)," ",$data(k),!
          set i=10,j=20,k=30
          write "in subroutine ",i," ",j," ",k,!
          export k
          quit
    zmain
          write "in main",!
          set i=1,j=2,k=3
          write "before call to subroutine ",i," ",j," ",k,!
          do ^tst(i,j)
          write "after call to subroutine ",i," ",j," ",k,!
    writes:
    in main
    before call to subroutine 1 2 3
    in subroutine 1 1 0
    in subroutine 10 20 30
    after call to subroutine 1 2 30

  9. for (abbreviation: f)

    Iterative loop command. Examples of the single line scope for:

    for i=1:1:10 write i,! // prints 1 through 10 on successive lines
    for i=10:-1:1 write i,! // prints 10 through 1 on successive lines
    for i=1:2:10 write i,! // prints 1, 3, 5, 7, and 9 on successive lines
    for i=1:1 write i,! quit:i>100 // no upper limit format
    set i=0 for  set i=i+1 write i,! quit:i>100 // infinite loop format
    for i=1,4,6 write i,! // explicit values of index
    for i=2,5,10:1:20,30,100:1:105 write i,! // mixed formats - iterative and explicit
    for  write "hello",! // write hello infinitely
    for i=$order(^a(i)) write i,! // writes all the indices of ^a()
    for i=2:$order(^a(i)):8 write i,! // writes indices of ^a() between 2 and 8

    If the for has no arguments (the "do forever" format), it must be followed by two blanks.
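    The sequence of values produced by the standard start:increment:limit and explicit-value arguments can be modeled with a short generator. The following Python sketch is an illustration only: the name for_values is hypothetical, it accepts only integer constants, and it does not cover the argumentless form or the function-based extensions of Mumps/II.

```python
from itertools import islice


def for_values(arguments):
    """Yield loop values for a comma-separated list of for arguments.

    Each argument is an explicit value ("4"), a start:increment pair with
    no upper limit ("1:1"), or a start:increment:limit triple ("1:2:10").
    """
    for argument in arguments.split(","):
        parts = argument.split(":")
        if len(parts) == 1:
            # Explicit value: the loop body runs once with this value.
            yield int(parts[0])
            continue
        start, step = int(parts[0]), int(parts[1])
        limit = int(parts[2]) if len(parts) == 3 else None
        value = start
        # A negative increment counts down to the limit, otherwise up.
        while limit is None or (value <= limit if step >= 0 else value >= limit):
            yield value
            value += step
```

    For example, list(for_values("1:2:10")) gives [1, 3, 5, 7, 9]. An argument without a limit never ends on its own, so in this model, as in the "no upper limit" example above, the caller must stop it, for instance with list(islice(for_values("1:1"), 3)).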

    Mumps Compiler non-standard extensions permit other forms of the for loop. The first of these, shown in the example below, permits a function to be executed for each iteration. In this format, the increment is replaced by a function call. The loop variable is initialized and, for each iteration, the function is invoked and its result is copied to the loop variable. The loop iterates until the value of the loop variable becomes the third argument or endlessly, if the third argument is omitted. The function is invoked immediately after initialization before the first iteration. Thus, the first value of the variable in the loop is not the initialized value but the result of the first function execution.

    zmain
          for i=1:1:10 s ^a(i)=i // initialize a global array
    e1    for i="":$order(^a(i)):"" write ^a(i),! // write all values of ^a()
          write !
          kill ^a
          for i=1:1:3 for j=1:1:3 for k=1:1:3 set ^a(i,j,k)=i
          for a="^a(1)":$query(a):"" do
          . if $piece(a,"(",1)'="^a" break // $query() returns all globals
          . write a," -> "
          . write $qsubscript(a,0)," ",$qsubscript(a,1)," ",$qsubscript(a,2)," ",$qsubscript(a,3),!
          write !
          for i="":$order(^a(i)):"" do
          . for j="":$order(^a(i,j)):"" do
          .. for k="":$order(^a(i,j,k)):"" do
          ... write i," ",j," ",k,!
    writes:
    1
    10
    2
    3
    4
    5
    6
    7
    8
    9

    ^a("1","1","1") -> a 1 1 1
    ^a("1","1","2") -> a 1 1 2
    ^a("1","1","3") -> a 1 1 3
    ^a("1","2","1") -> a 1 2 1
    ^a("1","2","2") -> a 1 2 2
    ^a("1","2","3") -> a 1 2 3
    ^a("1","3","1") -> a 1 3 1
    ^a("1","3","2") -> a 1 3 2
    ^a("1","3","3") -> a 1 3 3
    ^a("2","1","1") -> a 2 1 1
    ^a("2","1","2") -> a 2 1 2
    ^a("2","1","3") -> a 2 1 3
    ^a("2","2","1") -> a 2 2 1
    ^a("2","2","2") -> a 2 2 2
    ^a("2","2","3") -> a 2 2 3
    ^a("2","3","1") -> a 2 3 1
    ^a("2","3","2") -> a 2 3 2
    ^a("2","3","3") -> a 2 3 3
    ^a("3","1","1") -> a 3 1 1
    ^a("3","1","2") -> a 3 1 2
    ^a("3","1","3") -> a 3 1 3
    ^a("3","2","1") -> a 3 2 1
    ^a("3","2","2") -> a 3 2 2
    ^a("3","2","3") -> a 3 2 3
    ^a("3","3","1") -> a 3 3 1
    ^a("3","3","2") -> a 3 3 2
    ^a("3","3","3") -> a 3 3 3

    1 1 1
    1 1 2
    1 1 3
    1 2 1
    1 2 2
    1 2 3
    1 3 1
    1 3 2
    1 3 3
    2 1 1
    2 1 2
    2 1 3
    2 2 1
    2 2 2
    2 2 3
    2 3 1
    2 3 2
    2 3 3
    3 1 1
    3 1 2
    3 1 3
    3 2 1
    3 2 2
    3 2 3
    3 3 1
    3 3 2
    3 3 3

    Alternatively, other starting and ending values can be specified:

    for i="":$order(^a(i)):12 write ^a(i),! // writes values up to but not including 12
    for i=2:$order(^a(i)):8 write ^a(i),! // writes values 3 through 7, inclusive

    Another format is:

    set i=""
    for i=$order(^a(i)) write ^a(i),!

    writes all level 1 values of the ^a() global.

    do $zwi("this is a test string")
    for i=$zwn write i,!
    writes:
    this
    is
    a
    test
    string

    where the loop iterates so long as the value of the loop variable does not become zero or the empty string. The function is invoked before each iteration, including the first. The first example iterates through all values of the global array. The second example initializes the internal buffer with the string, then extracts each word from the string until there are no more (an empty string is returned).

    Note: the order of presentation of the indices is determined by the collating sequence in the above.

    Important Notes on break and quit

    • A quit in a single line for terminates processing of the for. If there are multiple for commands, it terminates the nearest:

      for i=1:1:10 write i,! if i>5 quit

      writes 1 through 6 only.

      for i=1:1:10 for j=1:1:10 write j,! if j>5 quit

      writes 1 through 6 ten times.

      for i=1:1:10 write i,! if i>5 break

      illegal - generates error message - break not allowed.

    • A break may NOT be used in a single line for command. It may ONLY be used in an indented block that was introduced by a do command.

    • In an indented block, quit and break have special meanings:

      • A quit ends further processing of the block in which it appears and returns to the line containing the invoking do at a point just after the do. Processing of the line containing the invoking do resumes. If there are more commands on the line, they are executed.

      • A break ends further processing of the block in which it appears but does not return to the line containing the invoking do. Instead, execution moves to the line following the block in which the do appears. Examples:

      for i=1:1:10 do  write " continuing"
      . write !,i
      . if i>5 quit
      . write " ",i
      write !,"done",!
      writes:
      1 1 continuing
      2 2 continuing
      3 3 continuing
      4 4 continuing
      5 5 continuing
      6 continuing
      7 continuing
      8 continuing
      9 continuing
      10 continuing
      done

      In this example, the block is invoked 10 times. After each invocation, the remainder of the line containing the for is executed, producing the instances of the word "continuing". Each block invocation prints the value of "i". When the value of "i" is greater than 5, the block executes the quit command, thus returning to the invoking line early. When the value of "i" is 5 or less, the full block is executed and return is made to the invoking line at block end. When the for command finishes execution, control is passed to the line following the for and "done" is printed.

      set i=9
      if i>0 do  write " continuing"
      . write !,i
      . if i>5 quit
      . write " ",i
      write !,"done",!
      writes:
      9 continuing
      done

      In this example, the block is entered, the value of "i" is printed but, because "i" is greater than 5, the quit is executed and control is returned to the invoking do and the word "continuing" is printed. Now, the line being completely executed, control passes to the line following the block and "done" is printed.

      for i=1:1:10 do  write " mark " do  write " end of line",!
      . write i
      . if i>5 quit
      . write "X"
      writes:
      1X mark 1X end of line
      2X mark 2X end of line
      3X mark 3X end of line
      4X mark 4X end of line
      5X mark 5X end of line
      6 mark 6 end of line
      7 mark 7 end of line
      8 mark 8 end of line
      9 mark 9 end of line
      10 mark 10 end of line

      In this example, multiple do commands are shown. Note the two blanks following each. Each do invokes the block following the line containing the do.

      On the other hand, the break command terminates the block in which it is contained, but execution does not return to the line containing the invoking do. Instead, it continues with the line following the block:

      for i=1:1:10 do  write " continuing"
      . write !,i
      . if i>5 break
      . write " ",i
      write !,"done",!
      writes:
      1 1 continuing
      2 2 continuing
      3 3 continuing
      4 4 continuing
      5 5 continuing
      6
      ---------------------------------------
      set i=9
      if i>0 do  write " continuing"
      . write !,i
      . if i>5 break
      . write " ",i
      write !,"done",!
      writes:
      9
      done
      ---------------------------------------
      for i=1:1:10 do  write " mark " do  write " end of line",!
      . write i
      . if i>5 break
      . write "X"
      write !
      writes:
      1X mark 1X end of line
      2X mark 2X end of line
      3X mark 3X end of line
      4X mark 4X end of line
      5X mark 5X end of line
      6

      In these examples, execution of the break can be seen to terminate the current block and move to the line following the block.

    • Note: the contents of $test revert to their former value when exiting an indented block by means of break or quit:

      if 1=1 do
      . write "test 1: ",$test,!
      . if 1=2 write "wow",!
      . else  write "not wow",!
      . write "test 2: ",$test,!
      write "test 3: ",$test,!
      writes:
      test 1: 1
      not wow
      test 2: 0
      test 3: 1

      If you exit a block with a goto, the value of $test is not restored.

    see: Unsupported and Non-Standard Features

  10. goto (abbreviation: g)

    The goto command causes unconditional transfer of control. In the compiler, local program labels and functions may be used. Variables are not permitted. Label names must not be the same as variable names. Both the command and each argument may be post-conditionalized. If you exit an indented block with a goto, the value of $test is not restored to its pre-block value.

    Permitted argument forms:

    goto label
    goto ^function

    The second example causes the current program to terminate and control to be passed to a different executable named "function.cgi". This is accomplished by an "execve()" system function. The symbol table is dumped prior to transfer and loaded by the target executable. Control is never returned to the invoking program unless the invoked program cannot be executed.

  11. halt (abbreviation: h)

    The halt command terminates execution of the Mumps program and returns you to the operating system or web server. It may be post-conditionalized. For example:

    halt:I=3

  12. hang (abbreviation: h)

    The hang instruction suspends execution of your program for a specified period of time (in seconds). It takes as an argument the number of seconds to wait. It may be post-conditionalized. The hang instruction differs from the halt instruction only in the argument: a hang without an argument is a halt instruction. For example:

    hang:I=J 2*K

  13. html (no abbreviation)

    The html command causes the remainder of the line to be treated as html code and sent to the standard output. The command may not be abbreviated. The remainder of the line is scanned first for embedded Mumps expressions which are replaced by their results:

    &! codes which will be replaced by line feeds (same as the ! in the write statement).

    &~expression~ codes whose expression will be evaluated and the result substituted for the entire code. Thus, assuming the Mumps variable ptid exists:

    html <center> Patient &~ptid~ </center>

    The value of ptid will be substituted into the line and the result will be sent to the standard output.

  14. if (abbreviation: i)

    The if command permits conditional execution. The scope of the command is the remainder of the line unless there is an argumentless do command on the line. If one of the arguments to the if is false, execution proceeds to the next line. Thus the following is in error:

    set i=2
    if i=3 write "no",! else  write "yes",!

    The else is NOT executed.

    Note that expressions are evaluated left to right. This sometimes causes problems for people used to dealing with other languages. For example, the expression:

    I=0&J<0

    is always false since it is parsed as: (((I=0)&J)<0)

    If I is zero, the first expression is true (value of 1); if J is less than zero, then J is interpreted as true giving, as a result of the AND operation (&), a value of 1 which is not less than zero - therefore false. If I is not zero then, regardless of the value of J, the AND operation results in false (value of zero) which is not less than zero - therefore false. The expression should have been written as:

    (I=0)&(J<0)

    The if command may be used with no arguments (requires 2 blanks after the command word) or with multiple arguments separated by commas. If no arguments are given, the current value of $test is examined and, if true, the remainder of the line containing the if is executed. If multiple arguments are given, each is evaluated in turn. If all are true, the remainder of the line is executed. If an argument is not true, the remainder of the line, including any unevaluated arguments, is not executed and execution resumes on the next line.

    The if command sets $test: if the argument(s) is/are true, $test is set to true, false otherwise.

    An if command with single line scope may initially set $test to true, but a subsequent command on the same line may set it to false. Thus, on the following line, $test may be false even though the expression in the previous if was true (see example below).

    The value of $test is restored upon return from nested blocks except if those blocks are exited with a goto command.

    See the discussion on the effect of break and quit on blocks invoked by if commands.

    Examples:

    set i=1,j=2,k=3
    if i=1 write "true",! // writes true
    if  write "true",! // writes true
    if i=1,j=2,k=3 write "true",! // writes true
    if i=1,j=3,k=3 write "true",! // does not write true
    if i=1 do
    . write "true",! // writes true
    if i=1 if j=2 write "true",! // writes true
    write $test,! // writes 1
    if i=1 if j=9 write "true",! // does not write true
    write $test,! // writes 0
    if i=1 do
    . write "test 1 ",$test,!
    . if j=20 do
    .. write "test 2 ",$test,!
    . write "test 3 ",$test,!
    write "test 4 ",$test,!
    writes:
    test 1 1
    test 3 0
    test 4 1

  15. job (abbreviation: j)

    The job command performs a fork(). The global arrays are not closed prior to the fork.

    If you are using the native globals, you must close them with the ZCLOSE command and reopen them in either the parent or child. Both processes may not open the globals concurrently. If both attempt to open the globals after the fork, one will pend until the other has closed the globals (zclose).

    With the Berkeley DB, both processes may access the globals and there is no need to close them prior to the job command.

  16. kill (abbreviation: k)

    The kill command is used to prune the symbol table and to delete parts of the global arrays.

    There are three forms of the kill command: the first deletes all entries in the local (i.e., non-global) symbol table; the second deletes specific elements from the local symbol table or specific elements from the global arrays; and the third is used to delete all elements from the local symbol table except for certain named symbols. All forms may be post-conditionalized. The first form - delete the entire local symbol table - is denoted by the kill command alone:

    kill

    The second form appears as a list of references (note: indirection for the names is permitted):

    kill A,B,^G(1,2,33)

    The above would delete variables A, B and the global array node ^G(1,2,33). Note: if the global array node ^G(1,2,33) has descendants, they are also deleted. Also, if a local array node is deleted, any of its descendants are also deleted.

    The final form of the kill may be used to delete all elements from the local symbol table except for certain protected elements. It has the following format:

    kill (A,B(1,1),K)

    In the above, the local symbol table will be deleted except for variables A, B(1,1) and K. All other variables will be lost. Global array nodes may not be used in this form of the kill statement.

    Indirection is permitted.
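    The three forms of kill can be modeled on a local symbol table held as a dictionary. The following Python sketch is an illustration only, not the Mumps/II symbol table: the function names are hypothetical, each node is keyed by a tuple of its name and subscripts, and the treatment of descendants in the exclusive form (they are kept along with the protected node) is an assumption.

```python
def is_node_or_descendant(key, node):
    """True if key is node itself or lies below node in the array tree."""
    return key[:len(node)] == node


def kill_all(table):
    """Model of the argumentless kill: delete the entire local table."""
    table.clear()


def kill(table, *nodes):
    """Model of kill A,B(1): delete each node and all of its descendants."""
    for node in nodes:
        for key in [k for k in table if is_node_or_descendant(k, node)]:
            del table[key]


def kill_except(table, *protected):
    """Model of kill (A,B(1,1),K): delete everything not protected.

    Assumption: descendants of a protected node are kept as well.
    """
    for key in list(table):
        if not any(is_node_or_descendant(key, p) for p in protected):
            del table[key]
```

    With a table holding A, B, B(1,1), B(1,1,5), K and Z, kill_except(table, ("A",), ("B", 1, 1), ("K",)) leaves A, B(1,1), B(1,1,5) and K, and a following kill(table, ("B", 1, 1)) removes B(1,1) and its descendant B(1,1,5).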

  17. lock (abbreviation: l)

    The lock command marks a portion of the data base for exclusive access by an individual user. A lock with no arguments frees all prior locks. The locks are stored in a file named "Mumps.Locks" which is opened for exclusive access by the locking/unlocking job. The contents of the file may be deleted to remove all locks.

    Examples:

    lock // removes all locks
    lock ^a(1) // locks ^a(1) and all children
    lock +^a(1) // increment lock count for ^a(1)
    lock -^a(1) // decrement lock count for ^a(1)
    lock ^a(1):20 // lock ^a(1) with a 20 second timeout
    lock ^a(1),^b(1) // multiple locks

  18. merge (abbreviation: m)

    The merge command copies from one array to another. At present, only global arrays may participate in the merge command. Examples:

    for i=1:1:9 for j=1:1:9 set ^a(i,j)=i+j
    merge ^c=^a // copies all of ^a to ^c
    for i=100:1:109 s ^b(i)=i
    merge ^b(103)=^a(3) // copies ^a(3) to ^b(103) and children of
                        // ^a(3) to be children of ^b(103)
    merge ^d=^a(3) // creates ^d=^a(3); ^d(1)=^a(3,1),...

    Note: in code of the form "merge ^x(103)=^a(3)", ^x(103) will not exist after the merge if it did not exist previously and "^a(3)" itself does not exist. In the above examples, "^a(3)" does not exist (but it does have descendants).
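    The copying performed by merge, including the case where the source node itself has no value, can be modeled on a dictionary of global nodes. The following Python sketch is an illustration only and says nothing about how Mumps/II stores globals: the name merge_nodes is hypothetical and each node is keyed by a tuple of the global name and its subscripts.

```python
def merge_nodes(db, target, source):
    """Copy the source node and all of its descendants under target.

    db maps node keys such as ("a", 3, 1) to values. A node that has no
    value of its own simply has no entry, so it is never copied.
    """
    depth = len(source)
    # Take a snapshot first so that db can be modified while copying.
    found = [(key, value) for key, value in db.items() if key[:depth] == source]
    for key, value in found:
        db[target + key[depth:]] = value


# Build the globals used in the examples above.
db = {}
for i in range(1, 10):
    for j in range(1, 10):
        db[("a", i, j)] = i + j
for i in range(100, 110):
    db[("b", i)] = i

merge_nodes(db, ("b", 103), ("a", 3))  # merge ^b(103)=^a(3)
merge_nodes(db, ("x", 103), ("a", 3))  # merge ^x(103)=^a(3)
```

    After these calls, ("b", 103) keeps its old value of 103 and gains children such as ("b", 103, 1) with the value 4, while ("x", 103) has children but no entry of its own, because ("a", 3) has descendants but no value.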

  19. new (abbreviation: n)

    The new command creates a new symbol table or namespace and pushes the old namespace onto a stack. The Mumps compiler implementation of new differs from the legacy standard Mumps. See also: export

    If new is used with no arguments, a new namespace is created, any previous namespaces are pushed-down and unique variable names from prior namespaces are instantiated with no values in the new namespace. Upon exit from the block in which the new was executed, the old namespace is restored and the contents of the new namespace are lost. A block is determined to be exited if you execute a quit from a section of code entered by a do without arguments or you revert to a lower level of dotted indent.

    A call to a separately compiled function results in an implied 'new'. A call to a local function with parameters also results in an implied 'new'. A call to a local function without parameters does not result in an implied 'new'. See the examples below.

    If you include an argument to new consisting of a parenthesized list of variable names, a new namespace will be created that contains instantiated copies, with no values, of the unique variable names from lower namespaces except those variable names in the parenthesized list.

    If you include one or more variable names as an argument, not enclosed in parentheses, new copies of these variables will be created. Upon exit, the old copies will be restored.

    At any given level, there may be only one new command.

    Examples:

          zmain

          # Example 1 'new'
          # 'new' creates a new symbol table and
          # pushes down the old table. The new table
          # is empty. Upon exit from the 'do' block,
          # the old table is restored and contents of
          # new symbol table are lost.

          set i=1,j=2,k=3
          write "before ",i," ",j," ",k,!
          do
          . new
          . set i=100,j=200,k=300
          . write "in block ",i," ",j," ",k,!
          write "after ",i," ",j," ",k,!

          # writes 1 2 3

          # Example 2 'new exclusive'
          # 'new' creates a new symbol table and
          # pushes down the old table. Variable
          # 'i' is created in the new table along
          # with its contents. Otherwise, the new
          # table is empty. Upon block exit, the
          # contents of the new table are lost,
          # including any changes to 'i' and the
          # old table is restored along with the
          # old value of 'i'.

          set i=1,j=2,k=3
          write "before ",i," ",j," ",k,!
          do
          . new (i)
          . set j=200,k=300
          . write "in block ",i," ",j," ",k,!
          write "after ",i," ",j," ",k,!

          # writes:
          # before 1 2 3
          # in block 1 200 300
          # after 1 2 3

          # Example 3 'new inclusive'
          # 'new' pushes down the old symbol table and
          # creates a new symbol table. The new table
          # contains all the contents of the old table
          # including an instance of 'i' which contains
          # the empty string. Since 'i' was in the old
          # symbol table, its previous value is not
          # visible. Upon exit from the block, the
          # old symbol table becomes visible with none
          # of the variables changed. No values are copied
          # to the old symbol table.

          set i=1,j=2,k=3
          write "before ",i," ",j," ",k,!
          do
          . new i
          . write "in block ",i," ",j," ",k,!
          write "after ",i," ",j," ",k,!

          # writes: (value of 'i' in block is the empty string)
          # before 1 2 3
          # in block  2 3
          # after 1 2 3

          halt

          # Example 4 - implied 'new' with separately compiled functions
          # A call to a separately compiled function results in an
          # implied 'new' for the function. The old symbol table is
          # restored on exit.

    ^tst(a,b) write "in subroutine ",$data(i)," ",$data(j)," ",$data(k),!
          set i=10,j=20,k=30
          write "in subroutine ",i," ",j," ",k,!
          quit

          zmain
          write "in main",!
          set i=1,j=2,k=3
          write "before call to subroutine ",i," ",j," ",k,!
          do ^tst(5,20)
          write "after call to subroutine ",i," ",j," ",k,!

          # writes:
          # in main
          # before call to subroutine 1 2 3
          # in subroutine 0 0 0
          # in subroutine 10 20 30
          # after call to subroutine 1 2 3

          # Example 5 - implied 'new' with call by reference.
          # Same as above except that changes to the arguments
          # passed by reference (.i,.j) are returned to the
          # calling symbol table.

    ^tst(i,j) write "in subroutine ",$data(i)," ",$data(j)," ",$data(k),!
          set i=10,j=20,k=30
          write "in subroutine ",i," ",j," ",k,!
          quit

          zmain
          write "in main",!
          set i=1,j=2,k=3
          write "before call to subroutine ",i," ",j," ",k,!
          do ^tst(.i,.j)
          write "after call to subroutine ",i," ",j," ",k,!

          # writes:
          # in main
          # before call to subroutine 1 2 3
          # in subroutine 1 1 0
          # in subroutine 10 20 30
          # after call to subroutine 10 20 3

          # Example 6 - implied 'new' on local functions with parameters
          # If a local function (one without '^' at the beginning of its
          # name) is called, an implied 'new' is performed:

          zmain
          set i=10
          set j=20
          set k=30
          write "main program: ",i," ",j," ",k,!
          do test(i,j,k)
          write "main program: ",i," ",j," ",k,!
          halt

    test(a,b,c) write "sub-program: ",a," ",b," ",c,!
          set a=11
          set b=22
          set c=33
          quit

          # writes:
          # main program: 10 20 30
          # sub-program: 10 20 30
          # main program: 10 20 30

          # Example 7 - implied 'new' on local functions with call by
          # reference: (similar to separately compiled functions above)

          zmain
          set i=10
          set j=20
          set k=30
          write "main program: ",i," ",j," ",k,!
          do test(.i,.j,.k)
          write "main program: ",i," ",j," ",k,!
          halt

    test(a,b,c) write "sub-program: ",a," ",b," ",c,!
          set b=22
          set c=33
          quit

          # writes:
          # main program: 10 20 30
          # sub-program: 10 20 30
          # main program: 10 22 33

          # Example 8 - no implied 'new' for local functions without parameters
          # A local function called without parameters shares the same symbol
          # table as the calling routine. All variables are available and any
          # changes are visible upon return:

          zmain
          set i=10
          set j=20
          set k=30
          write "main program: ",i," ",j," ",k,!
          do test
          write "main program: ",i," ",j," ",k,!
          halt

    test  write "sub-program: ",i," ",j," ",k,!
          set j=22
          set k=33
          quit

          # writes:
          # main program: 10 20 30
          # sub-program: 10 20 30
          # main program: 10 22 33

  20. open (abbreviation: o)

    The open command opens sequential files and associates them with unit numbers. Opened files may be read or written. This implementation permits the programmer to use unit numbers 1 through 4 and 6 through 9. Unit 5 is reserved for the normal user console (stdin and stdout). The open command takes a device number (a number, or an expression which evaluates to a number, from the set 1,2,3,4,6,7,8,9) and a file name. The file name must be either a variable name or a quoted literal. It consists of a valid file name followed by one of ,new ,append or ,old.

    If the name is followed by ,new, the program assumes you are opening the file for output: any previous file with this name is lost, and you may only write to the file. If you open the file with the ,old option, the program assumes the file exists and opens it for input only (reads). If you specify ,append, the file is opened for output, and all new data written to the file appears after the previously existing data. If an error takes place, $test is set to zero and the remainder of the command is not inspected. If no error takes place, $test will be one. The user should not attempt to reference units for which the open command returned a $test value of 0. For example:

    open 1:"data,old"
    open 1:"data,new"
    open 1:"data,append"

    You should be careful to close any file you have opened for output so that you do not lose any of the file's contents. You must close an open unit number before re-using the unit number in another open command.

  21. quit (abbreviation: q)

    The quit command provides an exit point for the for, xecute (interpreter only) and do commands. It may be post-conditionalized. It normally takes no arguments (therefore, you need two spaces after it if there are any commands following it on the same line). In the for case, the quit terminates the nearest loop. In the do case, the quit returns to the most recently invoking command. In the xecute case, execution of the xecute text is terminated and control is returned to the original command line. The continue command is an alias for quit.

    Arguments are permitted for the quit command if the command is being used to return from a block invoked by a function. For example:

          Write $$abc,!
          . . .
    abc   quit "test"

    The above will print "test".

    see: Notes on break and quit

  22. read (abbreviation: r)

    The read command reads data into variables. It may be post-conditionalized. Ordinarily the read command reads from the user's terminal (unit number 5). The read command can be redirected to other devices and files, however, by use of the use command (see below). It is a common source of error - sometimes quite destructive errors - for the user to read or write to the wrong device. Many Mumps programmers explicitly place the use command immediately prior to the read or write commands. Ordinarily, the read command takes one or more arguments which may be local scalar variables, local array elements, or global array elements. Each of these is read successively from the input device. When more than one argument is present, a carriage return / line feed is taken as the delimiter between the successive input values. For example:

    read A,B,^A(1,3,99)

    If the input is derived from the user's terminal, the user might type the following sequence in response to the above:

    22
    38
    NOW IS THE TIME

    The read command may also write before it reads. This mode is only permitted at the user's console. The options permitted for "write before read" are: a literal constant in quotes; and a tab, new line, or new page operation. A tab operation is specified by a question mark followed by an expression which is interpreted as arithmetic. The effect is to cause the cursor to move to the named column on the page or screen. The new line operation is caused by typing an exclamation point (!). The program will generate a carriage return / line feed pair. A new page is induced by the pound-sign (#) character. For example, if you wish to read name, social security number and password with user input from column 20, the following might be appropriate:

    read "NAME",?20,N,"SSN",?20,S,"PASS",?20,P

    After the above, the variable N will contain the name, S the social security number and P the password.

    The read command also permits single character input. That is, a read operation will be satisfied as soon as the user strikes any character on the keyboard: no carriage return is required. The variable will contain a number which is the equivalent of the ASCII character struck. This mode is denoted by preceding the variable to be read by an asterisk. For example:

    read "ENTER A LETTER ",*A

    The read is satisfied when any character is struck. There is also another form of the read: that which contains "time-outs". Time-outs permit the programmer to specify a maximum interval of time which the program should wait for the user to reply to the read operation. The time-out may be used with either the regular or the character by character mode. The time-out is specified by placing a colon after the variable followed by an expression which will be interpreted as numeric. The value of the expression is the amount of time in seconds to wait for a user response to this operation. If the user fails to respond in the required interval, the $test built-in variable is set false (0) and the variable contains nothing. If the user does respond in time, $test is set true (1) and the variable contains the user's reply. For non-character by character mode input, the user must type the carriage return for the input to be valid. If the time-out expires before the user types the carriage return, all input is lost. For example:

    AGAIN read "ENTER NAME ",N:20 if '$test goto AGAIN

    Multiple variables may receive the same value by enclosing them in parentheses. For example:

    read (a,^b(1),c)

    If the input is "999", each variable in the list, "a", "^b(1)" and "c", will receive the value "999".

  23. set (abbreviation: s)

    The set command is the assignment command for Mumps. The expression on the right hand side of the equals sign is evaluated and placed in the storage associated with the variable on the left hand side. Global variables may be used on the left hand side. The set command may be post-conditionalized. The $piece function may be used on the left hand side of the assignment. For example, if the variable a contains the value "aaa.bbb.ccc", the command:

    set $p(a,".",2)="xxx"

    will result in the value of a becoming "aaa.xxx.ccc".
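
    The effect of $piece on the left hand side can be pictured with a short Python sketch. This is only an illustration of the semantics described above, not code from the compiler; the helper name set_piece is hypothetical, and the padding of missing pieces with empty strings is an assumption based on common Mumps behavior.

```python
def set_piece(value, delim, n, new):
    """Return value with its n-th delim-separated piece (1-based) replaced by new."""
    pieces = value.split(delim)
    # Assumption: if n is beyond the last piece, empty pieces are added first.
    while len(pieces) < n:
        pieces.append("")
    pieces[n - 1] = new
    return delim.join(pieces)


a = "aaa.bbb.ccc"
a = set_piece(a, ".", 2, "xxx")
print(a)  # aaa.xxx.ccc
```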

    Multiple values may be set at the same time in the compiler:

    set (i,^a(1),j,^a(2))=99

    The above will cause i,^a(1),j, and ^a(2) to be each set to the value 99. The $piece() function may not be used on any of the items in a multiple set.

  24. try (abbreviation: t)

    (compiler only) The try command has as its scope the current line. The commands on the line are executed. If any of the commands throws an error, execution terminates and control is transferred to the catch command on the following line. If no errors are thrown, execution skips the catch command and resumes with the line following the catch. A try command MUST be followed immediately by a catch command. Neither the try nor the catch command may contain a do command. The try command takes optional control word arguments (see below). If none are present, it MUST be followed by at least two blanks. Example:

    try  read a
    catch InputLengthException write "line too long",! halt
    write a

    The try command may be followed by one or more optional control words. If no control words are present, the command MUST be followed by at least two blanks. The optional control words are:

    1. NoMessages

      Used when invoking the interpreter to suppress interpreter error message generation. If NoMessages is not present and there is an error detected during interpretation, the interpreter will generate specific error messages to stdout.

    Example:

    try  read a
    catch InputLengthException write "line too long",! halt
    write a
    set a="2+2"
    try NoMessages write @a,!
    catch InterpreterException write "expression error",!

  25. use (abbreviation: u)

    The use command tells the program which device (unit) number to use for input/output operations (reads and writes). The unit designated as the input/output device remains in effect until it is changed by another use command or an error occurs. If the program detects an error, it always resets the current device number to 5, the number associated with the user's console terminal. The valid range of unit numbers in Mumps is 1 through 5. The current unit number can be determined from the $IO (abbreviation: $I) built-in variable. The command is specified as the letter U or the word use followed by an expression which is interpreted as numeric. For example:

    use 1

  26. view (abbreviation: v)

    The view command is not used in this implementation.

  27. write (abbreviation: w)

    The write command transmits data to an output device or file. Output goes to the file or device associated with the current unit number (see $io). Normally output is directed to the user's console terminal ($io=5, stdout). With the use command, however, data may be directed to another unit number and its associated file. All unit numbers other than unit 5 must be opened prior to use (unit number 5 is already open when the program begins and may not be otherwise opened). The write command takes as arguments literals in quotes, numeric constants, variable names (both local and global), and the control codes and expressions for tab, new line and new page operations. The control operations are the same as those discussed above for the read command. Output is stream oriented: a write does not begin on a new line but continues where the previous output left off. Wrap-around at your terminal depends upon the terminal's current line width setting. You may specify single character output by an asterisk followed by a decimal number. The system will send the ASCII character associated with the number you specify (e.g., *7 will send the BELL character). For example:

    write "Name of Patient",?25,name
    write !,"Age",?20,age
    write *65,*66,*67,!
    set i="10*2"
    write @i

  28. Xecute (abbreviation: x)

    The xecute command permits execution of strings as though they were lines of code. In the Mumps Compiler, xecute is performed by a resident interpreter. Interpretation should be avoided, if possible, as it is considerably slower than direct execution of compiled code. At present, the xecute may not be followed by another command on the same line.

    Examples:

    set test="for i=1:1:20 write i,!"
    xecute test

    The numbers 1 through 20 will be written to the current output device.

  29. Z (abbreviation: z)

    The Z in ANSI Mumps permits the implementor to add special commands. The Mumps Compiler does not support Z type commands of other vendors.

    • zclose - Close the global array files. Causes all internal buffers to be written. Globals will automatically be re-opened if a reference to a global is subsequently made.

    • zexit - Indicates end of Mumps code section. If your program consists of one or more Mumps routines followed by one or more C++ functions, you must place a zexit command at the end of the last Mumps routine. The zexit command causes the final Mumps epilogue to be written. If this is not done, the following C++ functions will be interpreted, erroneously, as part of the last Mumps routine.

    • zmain - The zmain command is used to introduce the main program to the compiler. It is discussed in detail below. It and the remainder of the line on which it appears are ignored by Mumps programs.

Builtin Functions

  1. $ascii(e1) or $ascii(e1,i2)

    $ascii returns the numeric value of an ASCII character. The string is specified in e1. If no i2 is specified, the first character of e1 is used. If i2 is specified, the i2'th character of e1 is chosen. For example:

    $ascii("ABC") YIELDS 65
    $ascii("ABC",1) YIELDS 65
    $ascii("ABC",2) YIELDS 66
    $ascii("") YIELDS -1
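
    For readers more familiar with other languages, the behavior shown above can be sketched in Python. The function name mumps_ascii is hypothetical, and returning -1 for every out-of-range position (not only for the empty string) is an assumption.

```python
def mumps_ascii(s, i=1):
    """Return the code of the i-th character of s (1-based), or -1 if there is none."""
    if i < 1 or i > len(s):
        return -1
    return ord(s[i - 1])


print(mumps_ascii("ABC"))     # 65
print(mumps_ascii("ABC", 2))  # 66
print(mumps_ascii(""))        # -1
```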

  2. Special Text Functions

    The following builtin functions are compiled from the MDH Toolkit. They are compiled C++ functions that can be called directly from Mumps. Builtin functions can be added to the MDH package by modifying builtins.cpp and mumpsc/builtin.h.

    When invoked as Mumps functions, use the $zz... or $z... format. This is required if you are using the interpreter. If you are using the compiler, you may use either the $zz... and $z... format or the slightly faster compiler interface to the same functions.

    For the compiler interface, if invoked as functions with return values, each name is preceded by "$$^". If invoked by the do command, each name is preceded by the "^" only.

    For both forms, parameters are specified in the same manner. For example, the $zzCosine() and ^Cosine() functions, both of which return an answer, can be invoked with:

    set c=$zzCosine(^A,^B)
    set c=$$^Cosine(^A,^B)

    The builtin functions include:

    1. Document Similarity Functions

      $zzCosine(gbl1,gbl2) or ^Cosine(gbl1,gbl2)

      $zzSim1(gbl1,gbl2) or ^Sim1(gbl1,gbl2)

      $zzDice(gbl1,gbl2) or ^Dice(gbl1,gbl2)

      $zzJaccard(gbl1,gbl2) or ^Jaccard(gbl1,gbl2)

      Computes the Cosine/Sim1/Dice/Jaccard similarity coefficient between the first and second arguments. Both arguments are numeric global array vectors. For example:

      zmain
      kill ^A
      kill ^B
      set ^A("1")=3
      set ^A("2")=2
      set ^A("3")=1
      set ^A("4")=0
      set ^A("5")=0
      set ^A("6")=0
      set ^A("7")=1
      set ^A("8")=1
      set ^B("1")=1
      set ^B("2")=1
      set ^B("3")=1
      set ^B("4")=0
      set ^B("5")=0
      set ^B("6")=1
      set ^B("7")=0
      set ^B("8")=0
      write "Cosine=",$zzCosine(^A,^B),!
      write "Sim1=",$zzSim1(^A,^B),!
      write "Dice=",$zzDice(^A,^B),!
      write "Jaccard=",$zzJaccard(^A,^B),!

      Output:

      Cosine=0.75
      Sim1=6
      Dice=1
      Jaccard=1
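
      The four coefficients can be checked by hand with the Python sketch below. The formulas are the usual textbook definitions for weighted vectors and are an assumption here: they reproduce the output listed above, but the toolkit source is not quoted in this guide. The Python dictionaries stand in for the global arrays ^A and ^B.

```python
import math


def sim1(a, b):
    """Inner product of two sparse vectors stored as dictionaries."""
    return sum(a[k] * b[k] for k in a if k in b)


def cosine(a, b):
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return sim1(a, b) / norm if norm else 0.0


def dice(a, b):
    total = sum(a.values()) + sum(b.values())
    return 2 * sim1(a, b) / total if total else 0.0


def jaccard(a, b):
    s = sim1(a, b)
    denom = sum(a.values()) + sum(b.values()) - s
    return s / denom if denom else 0.0


A = {"1": 3, "2": 2, "3": 1, "4": 0, "5": 0, "6": 0, "7": 1, "8": 1}
B = {"1": 1, "2": 1, "3": 1, "4": 0, "5": 0, "6": 1, "7": 0, "8": 0}
print(cosine(A, B), sim1(A, B), dice(A, B), jaccard(A, B))
```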

    2. $zzAvg(vector) or ^Avg(gbl_vector)

      Computes and returns the average of the numeric values in the vector. For example:

      zmain
      for i=1:1:10 set ^a(99,i)=i
      set i=$zzAvg(^a(99))
      write "average=",i,!

      The above writes 5.5

    3. $zzBMGSearch(arg1,arg2) or ^BMGSearch(arg1,arg2)

      Boyer-Moore-Gosper Function - returns the number of non-overlapping occurrences of arg1 in arg2. These functions were obtained from ftp://ftp.uu.net/usenet/comp.sources.unix/volume5/bmgsubs.Z. They were written by Jeffrey Mogul (Stanford University), based on code written by James A. Woods (NASA Ames), and are believed to be in the public domain.

      zmain
      write $zBMGSearch("now","now is the now of the now in the know "),!

      writes:

      4
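
      The count returned in the example can be reproduced in Python, whose str.count method also counts non-overlapping occurrences. This sketch only illustrates the result; it does not use the Boyer-Moore-Gosper algorithm, and the function name is hypothetical.

```python
def count_nonoverlapping(pattern, text):
    """Number of non-overlapping occurrences of pattern in text."""
    if not pattern:
        return 0
    return text.count(pattern)


# The fourth match is the "now" inside "know".
print(count_nonoverlapping("now", "now is the now of the now in the know "))  # 4
```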

    4. $zzCentroid(gblMatrix,gblRef) or ^Centroid(gblMatrix,gblRef)

      A centroid vector gblRef is calculated for the invoking two dimensional global array gblMatrix. The centroid vector holds the average value of each column of the matrix. Any previous contents of the global array named to receive the centroid vector are lost. The global array gblMatrix must contain at least two dimensions. For example:

      zmain
      for i=0:1:10 do
      . for j=1:1:10 do
      .. set ^A(i,j)=5
      set %=$zzCentroid(^A,^B)
      for i=1:1:10 write ^B(i),!

      writes:

      5
      5
      5
      5
      5
      5
      5
      5
      5
      5
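
      A column-average centroid of this kind might be sketched in Python as follows. The matrix is a dictionary keyed by (row, column) in place of the global array, and averaging only over the rows that actually define a cell in a column is an assumption; the guide does not say how missing cells are treated.

```python
from collections import defaultdict


def centroid(matrix):
    """matrix maps (row, col) -> value; returns {col: mean of that column}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for (_, col), value in matrix.items():
        totals[col] += value
        counts[col] += 1
    return {col: totals[col] / counts[col] for col in totals}


# Same data as the example: rows 0..10, columns 1..10, every cell 5.
A = {(i, j): 5 for i in range(0, 11) for j in range(1, 11)}
B = centroid(A)
for j in range(1, 11):
    print(B[j])
```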

    5. $zzCount(gblVector) or ^Count(gblVector)

      Computes and returns the number of numeric values in the vector. For example:

      zmain
      kill ^a
      for i=1:1:10 set ^a(99,i)=i
      set i=$zzCount(^a(99))
      write "count=",i,!

      The above writes 10

    6. $zzScan([maxLen]), $zzScanAlnum([maxLen]), ^Scan() or ^ScanAlnum()

      Returns the next word in the current input stream delimited by white space. Words are restricted to a maximum length of 1023. For ^ScanAlnum(): words beginning with one or more digits are ignored although words containing digits other than in the first position are returned; and, by default (see below), words shorter than 3 characters or longer than 25 characters are ignored. Words returned by ^Scan() may contain any printable ASCII character. ^Scan() returns all characters in input words delimited by 'whitespace' without any conversion to lower case. Words returned by ^ScanAlnum() will contain only lower case alphabetic and numeric characters. ^ScanAlnum() converts upper case characters to lower case. Both functions will advance to additional lines as needed. If a word exceeds 1023 bytes, the results are undefined. When there are no more input words, an empty string is returned and $test is set to false. If only part of a line is scanned as a result of ^Scan() or ^ScanAlnum(), a subsequent 'read' command will begin at the white space following the last word read by ^Scan() or ^ScanAlnum().

      The default word size limits for ^ScanAlnum() may be changed by including embedded C++ code prior to invoking ^ScanAlnum(). If you change the limits, they remain changed until the program ends or you change them again, and the change affects all calls to ^ScanAlnum() regardless of whether they are in the current or some other subroutine. See the example below. Either or both word limits may be changed. If the limits are less than zero or greater than 1023, respectively, a NumericRangeException() will be thrown. Note: for interpreter execution, you must modify the interpreter driver "mumps.mps" and add the embedded C++ code prior to the interpreter being invoked.

      $zzScan and $zzScanAlnum work the same except each may take an optional argument specifying the maximum length word to be returned.

      Note: if scanning input from 'stdin' (i/o unit 5), you signal end of file by a control-d (control-z on DOS) on a separate line by itself.

      For the input line:

      now -- __ ?? !@#$%^&*()_+= IS 2for the time for

      zmain
      for  set i=$zzScan quit:'$test  write i,!

      writes:

      now
      --
      __
      ??
      !@#$%^&*()_+=
      IS
      2for
      the
      time
      for

      for  set i=$zzScanAlnum() quit:'$test  write i,!

      writes:

      now
      the
      time
      for

      To alter the scan word size limits, include C++ lines such as the following:

      + svPtr->ScanMinWordSize=5;
      + svPtr->ScanMaxWordSize=10;
      zmain
      for  set i=$$^ScanAlnum() quit:'$test  write i,!

      on input:

      a aa aaa aaaa aaaaa aaaaaa aaaaaaa aaaaaaaa aaaaaaaaa aaaaaaaaaa aaaaaaaaaaa

      writes:

      aaaaa
      aaaaaa
      aaaaaaa
      aaaaaaaa
      aaaaaaaaa
      aaaaaaaaaa

      loop based example:

      zmain
      for i=$$^ScanAlnum() do
      . write i,!

      writes:

      now
      the
      time
      for
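
      The filtering rules described for ^ScanAlnum() can be approximated in Python as shown below. This is a sketch under stated assumptions rather than the toolkit's code: the function name scan_alnum is hypothetical, non-alphanumeric characters are assumed to be dropped from a word before its length is tested, and the order of the tests is a guess. With these assumptions it reproduces both outputs listed above.

```python
def scan_alnum(line, min_len=3, max_len=25):
    """Return the words of line that a ScanAlnum-style filter would keep."""
    kept = []
    for raw in line.split():
        # Words beginning with a digit are ignored.
        if raw[0].isdigit():
            continue
        # Lower-case the word and keep only ASCII letters and digits (assumed).
        word = "".join(c for c in raw.lower() if c.isascii() and c.isalnum())
        if min_len <= len(word) <= max_len:
            kept.append(word)
    return kept


print(scan_alnum("now -- __ ?? !@#$%^&*()_+= IS 2for the time for"))
print(scan_alnum("a aa aaa aaaa aaaaa aaaaaa aaaaaaa aaaaaaaa "
                 "aaaaaaaaa aaaaaaaaaa aaaaaaaaaaa", min_len=5, max_len=10))
```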

    7. $zzMax(gbl) or ^Max(gbl_vector)

      Computes and returns the maximum numeric value in the vector. For example:

      zmain
      for i=1:1:10 set ^a(99,i)=$r(100)
      set i=$zzMax(^a(99))
      write "max=",i,!

      The above writes the largest value.

    8. $zzMin(gbl) or ^Min(gbl)

      Computes and returns the minimum numeric value in the vector. For example:

      zmain
      for i=1:1:10 set ^a(99,i)=i
      set i=$zzMin(^a(99))
      write "min=",i,!

      The above writes 1.

    9. $zzMultiply(gbl1,gbl2,gbl3) or ^Multiply(gbl_matrix1,gbl_matrix2,gbl_matrix3)

      Multiplies the first and second matrix leaving the result in the third. For example:

      zmain
      set ^d("1","1")=2
      set ^d("1","2")=3
      set ^d("2","1")=1
      set ^d("2","2")=-1
      set ^d("3","1")=0
      set ^d("3","2")=4
      set ^e("1","1")=5
      set ^e("1","2")=-2
      set ^e("1","3")=4
      set ^e("1","4")=7
      set ^e("2","1")=-6
      set ^e("2","2")=1
      set ^e("2","3")=-3
      set ^e("2","4")=0
      set %=$zzMultiply(^d,^e,^f)
      for i="":$order(^f(i)):"" do
      . for j="":$order(^f(i,j)):"" do
      .. write i," ",j," ",^f(i,j),!

      writes:

      1 1 -8
      1 2 -1
      1 3 -1
      1 4 14
      2 1 11
      2 2 -3
      2 3 7
      2 4 7
      3 1 -24
      3 2 4
      3 3 -12
      3 4 0
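
      The listed products can be verified with a small Python sketch that multiplies two sparse matrices held as dictionaries keyed by (row, column). It is an illustration only. The third row of the first matrix is taken to be (0, 4), which is what the listed output implies.

```python
def multiply(m1, m2):
    """Sparse matrix product; each matrix maps (row, col) -> value."""
    result = {}
    for (i, k), a in m1.items():
        for (k2, j), b in m2.items():
            if k == k2:
                result[(i, j)] = result.get((i, j), 0) + a * b
    return result


d = {(1, 1): 2, (1, 2): 3, (2, 1): 1, (2, 2): -1, (3, 1): 0, (3, 2): 4}
e = {(1, 1): 5, (1, 2): -2, (1, 3): 4, (1, 4): 7,
     (2, 1): -6, (2, 2): 1, (2, 3): -3, (2, 4): 0}
f = multiply(d, e)
for (i, j) in sorted(f):
    print(i, j, f[(i, j)])
```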

    10. $zPerlMatch(string,pattern) or ^perlmatch(string,pattern)

      Applies the Perl pattern to string and returns 1 if the pattern fits and 0 otherwise. The perlmatch function has the side effect of creating variables in the local symbol table to hold backreferences, the equivalent concept of $1, $2, $3, ... in Perl. Up to nine backreferences are currently supported, and can be accessed through the same naming scheme as Perl ($1 through $9). These variables remain defined up to a subsequent call to perlmatch, at which point they are replaced by the backreferences captured from that invocation. Undefined backreferences are cleared between invocations; that is, if a match operation captured five backreferences, then $6 through $9 will contain the null string. See here for further details. Example:

      zmain
      write "Please enter a telephone number:",!
      read phonenum
      if $zperlmatch(phonenum,"^(1-)?(\(?\d{3}\)?)?(-| )?\d{3}-?\d{4}$") do
      . write "+++ This looks like a phone number.",!
      . write "The area code is: ",$2,!
      else  do
      . write "--- This didn't look like a phone number.",!

      writes:

      Please enter a telephone number:
      (123) 456-7890
      +++ This looks like a phone number.
      The area code is: (123)

      zmain
      write "Please enter a telephone number:",!
      read phonenum
      if $zPerlMatch(phonenum,"^(1-)?(\(?\d{3}\)?)?(-| )?\d{3}-?\d{4}$") do
      . write "+++ This looks like a phone number.",!
      . write "The area code is: ",$2,!
      else  do
      . write "--- This didn't look like a phone number.",!

      writes:

      Please enter a telephone number:
      (123) 456-7890
      +++ This looks like a phone number.
      The area code is: (123)
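
      To see which part of the input each backreference captures, the same pattern can be tried in Python's re module, which accepts this particular expression unchanged. The sketch is only an analogy: the function name perl_match is hypothetical, and mapping unmatched groups to the empty string mirrors the description above rather than any toolkit code.

```python
import re

PHONE = re.compile(r"^(1-)?(\(?\d{3}\)?)?(-| )?\d{3}-?\d{4}$")


def perl_match(text):
    """Return (1, backreferences) on a match and (0, {}) otherwise."""
    m = PHONE.search(text)
    if m is None:
        return 0, {}
    # Groups that did not take part in the match become the empty string.
    refs = {n: (m.group(n) or "") for n in range(1, PHONE.groups + 1)}
    return 1, refs


matched, refs = perl_match("(123) 456-7890")
print(matched, refs[2])  # 1 (123)
```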

    11. $zReplace(string,pattern,replacement) or ^Replace(string,pattern,replacement)

      The regular expression in pattern is evaluated on string and, if there is a match, the matching section is replaced by replacement. Example:

      zmain
      set a="now is the time for all"
      set a=$$^Replace(a,"is","IS")
      write a,!
      set a="[[now is the time]]"
      if $zperlmatch(a,"(\[\[)(.*)(\]\])") do
      . set a=$$^Replace(a,$2,"ABC")
      . write a,!

      writes:

      now IS the time for all
      [[ABC]]

      zmain
      set a="now is the time for all"
      set a=$zReplace(a,"is","IS")
      write a,!
      set a="[[now is the time]]"
      if $zperlmatch(a,"(\[\[)(.*)(\]\])") do
      . set a=$zReplace(a,$2,"ABC")
      . write a,!

      writes:

      now IS the time for all
      [[ABC]]

      In the first part, the word 'is' is replaced by 'IS'. In the second part, a match is sought for any content between two sets of matching brackets ([[...]]). The matched section is in back reference $2. This is then used as a pattern to be replaced.

    12. $zzSum(gblVector) or ^Sum(gblVector)

      Computes and returns the sum of the numeric values in the vector. For example:

      zmain
      for i=1:1:10 set ^a(99,i)=i
      set i=$zzSum(^a(99))
      write "sum=",i,!

      The above writes 55.

    13. $zzTranspose(gblMatrix1,gblMatrix2) or ^Transpose(gbl_matrix1,gbl_matrix2)

      Transposes the first global array matrix leaving the result in the second. For example:

      zmain
      kill ^f
      set ^d("1","1")=2
      set ^d("1","2")=3
      set ^d("2","1")=4
      set ^d("2","2")=0
      set %=$zzTranspose(^d,^f)
      for i="":$order(^f(i)):"" do
      . for j="":$order(^f(i,j)):"" do
      .. write i," ",j," ",^f(i,j),!

      yields:

      1 1 2
      1 2 4
      2 1 3
      2 2 0

    14. $zShred(a1,a2) or ^Shred(arg1,size)
      $zShredQuery(a1,a2) or ^ShredQuery(arg1,size)

      The ^Shred() function shreds the input string arg1 into fragments of length size upon successive calls. The function returns a string of length zero when there are no more fragments of length size remaining (thus, short fragments at the end of a string are not returned). ^Shred() copies the input string to an internal buffer upon the first call. Subsequent calls retrieve from this buffer. When the buffer is consumed, the function will copy the contents of the next string submitted to the buffer. Example:

      zmain
      set a="now is the time for all good men to come to the aid of the party"
      for  do
      . set j=$zShred(a,5)
      . if j="" break
      . write j,!

      writes:

      nowis
      theti
      mefor
      allgo
      odmen
      tocom
      etoth
      eaido
      fthep
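
      The fragments listed above can be reproduced with the following Python sketch. It returns all fragments at once instead of one per call, and it removes blanks before cutting the string; the blank removal is an assumption inferred from the example output, since the fragments shown contain no spaces. The function name shred is hypothetical.

```python
def shred(text, size):
    """Cut text (blanks removed) into consecutive fragments of exactly size characters."""
    compact = text.replace(" ", "")
    # A short fragment left over at the end is not returned.
    return [compact[i:i + size] for i in range(0, len(compact) - size + 1, size)]


a = "now is the time for all good men to come to the aid of the party"
for fragment in shred(a, 5):
    print(fragment)
```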

      The ^ShredQuery() function shreds size shifted copies of the input string arg1 into fragments of length size upon successive calls. That is, the function first returns all the size fragments of the string in the same manner as ^Shred(). However, it then shifts the starting point of the input string to the right by one and returns all the size length fragments relative to the shifted starting point. It repeats this process a total of size times.

      The function returns a string of length zero when there are no more fragments of length size remaining (thus, short fragments at the end of a string are not returned). ^ShredQuery() initially copies the input string to an internal buffer upon the first call. Subsequent calls retrieve from this buffer. When the buffer is consumed, the function will copy the contents of the next string submitted to the buffer. Example:

      zmain
      set a="now is the time for all good men to come to the aid of the party"
      for  do
      . set j=$zShredQuery(a,5)
      . if j="" break
      . write j,!

      writes:

      nowis
      theti
      mefor
      allgo
      odmen
      tocom
      etoth
      eaido
      fthep
      owist
      hetim
      efora
      llgoo
      dment
      ocome
      tothe
      aidof
      thepa
      wisth
      etime
      foral
      lgood
      mento
      comet
      othea
      idoft
      hepar
      isthe
      timef
      orall
      goodm
      entoc
      ometo
      theai
      dofth
      epart
      isthe
      timef
      orall
      goodm
      entoc
      ometo
      theai
      dofth
      epart

    15. $zSmithWaterman(str1,str2,ShowAligns,ShowMat,Gap,MisMatch,Match) or ^SmithWaterman(str1,str2,ShowAligns,ShowMat,Gap,MisMatch,Match)

      Computes the Smith-Waterman score between two strings. The result returned is the highest alignment score achieved. String lengths are limited by STR_MAX in the interpreter. If you compare very long strings (>100,000 characters), you may exceed stack space. This can be increased under Linux with the command:

      ulimit -s unlimited

      For example:

      zmain
      set s1="now is the time"
      set s2="now i th time"
      set i=$zSmithWaterman(s1,s2,1,0,-1,-1,2)
      write "score=",i,!

      yields:

      1 now- is the time 16
      ::: :: ::: :::::
      1 now i- th time 16
      score=23

      Parameters:

      If ShowAligns is zero, no printout of alignments is produced. If ShowAligns is not zero, a summary of the alternative alignments will be printed.

      If ShowMat is zero, intermediate matrices will not be printed.

      The parameters Gap, MisMatch and Match are the gap and mismatch penalties (negative integers) and the match reward (a positive integer).

      If insufficient memory is available, a segmentation violation will be raised. Try increasing the stack size (see above).

    16. $zzIDF(global,doccount) or ^IDF(gbl_ref,docCount)

      Calculates the Inverse Document Frequency score of words in the gbl_ref vector. The parameter docCount is the total number of documents. Each element of the gbl_ref vector contains the number of times the term (the index value for this element) occurs in the collection. See http://www.cs.uni.edu/~okane/source/ISR/isr.html for further details. For example:

      zmain
      set ^a("now")=2
      set ^a("is")=5
      set ^a("the")=6
      set ^a("time")=3
      set j=4
      set %=$zzIDF(^a,j)
      for i="":$order(^a(i)):"" write i," ",^a(i),!

      writes:

      is 0.7
      now 2.0
      the 0.4
      time 1.4
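
      The guide does not state the formula used. One formula that reproduces the four values listed above is log2(docCount/n)+1, shown to one decimal place. The Python sketch below uses it purely as an inferred assumption; the function name idf and the rounding are not taken from the toolkit.

```python
import math


def idf(counts, doc_count):
    """Assumed weighting: log2(doc_count / n) + 1 for each term count n."""
    return {term: math.log2(doc_count / n) + 1 for term, n in counts.items()}


scores = idf({"now": 2, "is": 5, "the": 6, "time": 3}, 4)
for term in sorted(scores):
    print(term, f"{scores[term]:.1f}")
```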

    17. $zzTermCorrelate(gbl_ref1,gbl_ref2) or ^TermCorrelate(gbl_ref1,gbl_ref2)

      Calculates the Term-Term co-occurrence matrix (gbl_ref2) for the Document-Term matrix in gbl_ref1.

      A Term-Term matrix has terms (words) as the indices of its rows and columns. A Term-Term matrix gives, for each position, the degree to which the term corresponding to the row is similar to the term corresponding to the column. The diagonal, which is the degree a term is related to itself, is ignored.

      A Document-Document matrix has document id's as its row and column indices. A cell in the matrix indicates the degree to which the row document is related to the column document. The diagonal is ignored.

      In both the doc-doc and term-term matrices, the upper and lower triangles of the matrix are mirror images of one another. For example:

      zmain
      kill ^A,^B
      set ^A("1","computer")=5
      set ^A("1","data")=2
      set ^A("1","program")=6
      set ^A("1","disk")=3
      set ^A("1","laptop")=7
      set ^A("1","monitor")=1
      set ^A("2","computer")=5
      set ^A("2","printer")=2
      set ^A("2","program")=6
      set ^A("2","memory")=3
      set ^A("2","laptop")=7
      set ^A("2","language")=1
      set ^A("3","computer")=5
      set ^A("3","printer")=2
      set ^A("3","disk")=6
      set ^A("3","memory")=3
      set ^A("3","laptop")=7
      set ^A("3","USB")=1
      set %=$zzTermCorrelate(^A,^B)
      for i="":$order(^B(i)):"" do
      . write i,!
      . for j="":$order(^B(i,j)):"" do
      .. write ?10,j," ",^B(i,j),!

      writes:

      USB
                computer 1
                disk 1
                laptop 1
                memory 1
                printer 1
      computer
                USB 1
                data 1
                disk 2
                language 1
                laptop 3
                memory 2
                monitor 1
                printer 2
                program 2
      data
                computer 1
                disk 1
                laptop 1
                monitor 1
                program 1
      disk
                USB 1
                computer 1
                data 1
                laptop 2
                memory 1
                monitor 1
                printer 1
                program 1
      language
                computer 1
                laptop 1
                memory 1
                printer 1
                program 1
      laptop
                USB 1
                computer 1
                data 1
                disk 1
                language 1
                memory 2
                monitor 1
                printer 2
                program 2
      memory
                USB 1
                computer 1
                disk 1
                language 1
                laptop 1
                printer 2
                program 1
      monitor
                computer 1
                data 1
                disk 1
                laptop 1
                program 1
      printer
                USB 1
                computer 1
                disk 1
                language 1
                laptop 1
                memory 1
                program 1
      program
                computer 1
                data 1
                disk 1
                language 1
                laptop 1
                memory 1
                monitor 1
                printer 1

    18. $zzDocCorrelate(gbl_ref1,gbl_ref2,method,threshold)
      ^DocCorrelate(gbl_ref1,gbl_ref2,method,threshold)

      The square Document-Document matrix gbl_ref2 is calculated from the Document-Term matrix gbl_ref1 according to method (Cosine, Sim1, Dice, Jaccard). Elements in the Document-Document matrix must exceed threshold.

      A Document-Document matrix has document id's as its row and column indices. A cell in the matrix indicates the degree to which the row document is related to the column document. The diagonal is ignored. For example:

      zmain
      kill ^A,^B
      set ^A("1","computer")=5
      set ^A("1","data")=2
      set ^A("1","program")=6
      set ^A("1","disk")=3
      set ^A("1","laptop")=7
      set ^A("1","monitor")=1
      set ^A("2","computer")=5
      set ^A("2","printer")=2
      set ^A("2","program")=6
      set ^A("2","memory")=3
      set ^A("2","laptop")=7
      set ^A("2","language")=1
      set ^A("3","computer")=5
      set ^A("3","printer")=2
      set ^A("3","disk")=6
      set ^A("3","memory")=3
      set ^A("3","laptop")=7
      set ^A("3","USB")=1
      set %=$zzDocCorrelate(^A,^B,"Cosine",.5)
      for i="":$order(^B(i)):"" do
      . write i,!
      . for j="":$order(^B(i,j)):"" do
      .. write ?10,j," ",^B(i,j),!

      which writes:

      1
                2 0.887096774193548
                3 0.741935483870968
      2
                1 0.887096774193548
                3 0.701612903225806
      3
                1 0.741935483870968
                2 0.701612903225806
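
      The values above agree with the standard cosine coefficient applied to each pair of document rows, as the following Python sketch shows. It is an illustration under assumptions, not the toolkit's implementation: the function name doc_correlate is hypothetical, only the Cosine method is covered, and nested dictionaries stand in for the global arrays.

```python
import math


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def doc_correlate(doc_term, threshold):
    """Return {doc: {other_doc: score}} for scores above threshold; the diagonal is skipped."""
    result = {}
    for i, row_i in doc_term.items():
        for j, row_j in doc_term.items():
            if i == j:
                continue
            score = cosine(row_i, row_j)
            if score > threshold:
                result.setdefault(i, {})[j] = score
    return result


A = {
    "1": {"computer": 5, "data": 2, "program": 6, "disk": 3, "laptop": 7, "monitor": 1},
    "2": {"computer": 5, "printer": 2, "program": 6, "memory": 3, "laptop": 7, "language": 1},
    "3": {"computer": 5, "printer": 2, "disk": 6, "memory": 3, "laptop": 7, "USB": 1},
}
B = doc_correlate(A, 0.5)
for i in sorted(B):
    for j in sorted(B[i]):
        print(i, j, B[i][j])
```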

    19. $zStopInit(file_name)
      ^StopInit(file_name)
      $zStopLookup(word)
      ^StopLookup(word)
      $zSynInit(file_name)
      ^SynInit(file_name)
      $zSynLookup(word)
      ^SynLookup(word)

      A call to ^StopInit(file_name)/$zStopInit(file_name) opens file_name and loads the stopwords it contains into a C++ container. The file should consist of one word per line. If the file cannot be opened or there is insufficient memory to hold the list of words, the program halts with an error message. ^StopInit()/$zStopInit() convert all words to lower case.

      A call to ^StopLookup(word)/$zStopLookup(word) returns 1 if word is in the stoplist, 0 otherwise. Words presented to ^StopLookup(word)/$zStopLookup(word) should be in lower case.

      ^SynInit(file_name)/$zSynInit(file_name) opens a synonym file. Each line of the file should consist of two or more words, each separated from the next by one blank. The first word on a line is returned if any of the remaining words on that line is sought by ^SynLookup(word)/$zSynLookup(word). For example:

      zmain
      set %=$zStopInit("stop")
      if $zStopLookup("and") write "yes",!
      set %=$zSynInit("synonyms")
      write $zSynLookup("compressions"),!

      writes:

      yes
      compression

  3. $char(i1) or $char(i1,i2) or $char(i1,i2,...)

    $char translates numeric arguments to ASCII character strings. Numeric values greater than 128 will generate errors. For example:

    $char(65) yields "A"
    $char(65,66) yields "AB"

  4. $data(vn)

    $data returns an integer which indicates whether the variable vn is defined. The value returned is 0 if vn is undefined; 1 if vn is defined and has no associated array descendants; 10 if vn has no associated value but does have descendants; and 11 if vn has both a value and descendants. The argument vn may be either a local or global variable. For example:

    set A(1,11)="foo"
    set A(1,11,21)="bar"
    $data(A(1)) ; yields 10
    $data(A(1,11)) ; yields 11
    $data(A(1,11,21)) ; yields 1
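The four return codes can be read as the sum of a "has a value" flag (1) and a "has descendants" flag (10). The Python sketch below models an array as a dictionary keyed by subscript tuples. This representation and the function name data are assumptions made for illustration, not a description of how Mumps stores arrays.

```python
def data(store, ref):
    """Model of $data: 1 if ref holds a value, plus 10 if any stored
    key extends ref with further subscripts."""
    has_value = ref in store
    has_descendants = any(
        len(key) > len(ref) and key[:len(ref)] == ref for key in store
    )
    return (1 if has_value else 0) + (10 if has_descendants else 0)

# Mirrors: set A(1,11)="foo"  set A(1,11,21)="bar"
store = {
    ("A", 1, 11): "foo",
    ("A", 1, 11, 21): "bar",
}
```

With this store, data(store, ("A", 1)) gives 10, data(store, ("A", 1, 11)) gives 11, and data(store, ("A", 1, 11, 21)) gives 1, matching the example above.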

    $get(i1) or $get(i1,i2)

    Returns the value of the local variable specified as the first operand or a default value, specified as the second operand. If the second operand is omitted, an empty string is the default value.

  5. $extract(e1) or $extract(e1,i2) or $extract(e1,i2,i3)

    $extract returns a substring of the first argument. The substring begins at the position given by the second argument. If the third argument is omitted, the substring consists only of the i2'th character of e1. If the third argument is present, the substring begins at position i2 and ends at position i3. Note that this differs from the usual SUBSTR function in PL/I, whose third argument is a length rather than an end position. If only e1 is given, the function returns the first character of e1. If i3 specifies a position beyond the end of e1, the substring ends at the end of e1. For example:

    $extract("ABC",2) yields "B"
    $extract("ABCDEF",3,5) yields "CDE"
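Because $extract uses 1-based positions with an inclusive end, it maps onto a zero-based, end-exclusive slice by subtracting one from the start only. The Python sketch below illustrates that mapping. The function name extract is hypothetical, and returning an empty string when the end precedes the start is an assumption not stated in the text.

```python
def extract(s, start=1, end=None):
    """Model of $extract: characters start..end, 1-based and inclusive."""
    if end is None:
        end = start            # two-argument form: a single character
    if end < start:
        return ""              # assumption: empty result for a reversed range
    start = max(start, 1)      # positions before the string begin at 1
    return s[start - 1:end]    # slicing stops at the end of s automatically
```

For instance, extract("ABC", 2) gives "B" and extract("ABCDEF", 3, 5) gives "CDE", as in the examples above.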

  6. $find(e1,e2) or $find(e1,e2,i3)

    $find searches the first argument for an occurrence of the second argument. If one is found, the value returned is one greater than the end position of the second argument in the first argument. If i3 is specified, the search begins at position i3 in argument 1. If the second argument is not found, the value returned is 0. For example:

    $find("ABC","B") yields 3
    $find("ABCABC","A",3) yields 5
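The returned value is the 1-based position of the character that follows the match, which is why $find("ABC","B") is 3 rather than 2. A short Python sketch of that arithmetic follows; the function name find is hypothetical.

```python
def find(s, target, start=1):
    """Model of $find: 1-based position just past the first match at or
    after position start, or 0 if there is no match."""
    index = s.find(target, max(start, 1) - 1)   # zero-based match position
    if index < 0:
        return 0
    return index + len(target) + 1              # convert to 1-based, step past match
```

Here find("ABC", "B") gives 3 and find("ABCABC", "A", 3) gives 5, matching the examples above.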

  7. $justify(e1,i2) or $justify(e1,i2,i3)

    $justify right justifies the first argument in a string field whose length is given by the second argument. In the two argument form, the first argument is interpreted as a string. In the three argument form, the first argument is right justified in a field whose length is given by the second argument with i3 decimal places. The three argument form imposes a numeric interpretation upon the first argument. For example:

    $justify(39,3) yields " 39"
    $justify("TEST",7) yields "   TEST"
    $justify(39,4,1) yields "39.0"
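The two forms can be modeled as an optional numeric formatting step followed by left padding with blanks. The Python sketch below is a rough model only. The function name justify is hypothetical, and Python's rounding of the final decimal place may differ from the Mumps rounding rule in edge cases.

```python
def justify(value, width, decimals=None):
    """Model of $justify: right justify value in a field of the given width.
    With decimals, value is first formatted as a number."""
    if decimals is None:
        text = str(value)                         # string interpretation
    else:
        text = f"{float(value):.{decimals}f}"     # numeric interpretation
    return text.rjust(width)                      # longer text is left unchanged
```

For instance, justify("TEST", 7) pads with three leading blanks, and justify(39, 4, 1) gives "39.0".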

  8. $len(e1) or $len(e1,e2)

    The $len function returns the string length of its argument. For example:

    $len("ABC") yields 3
    $len(22.5) yields 4

    If a second argument is given, the function returns the number of non-overlapping occurrences of "e2" in "e1" plus 1.
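The two argument form therefore counts the pieces into which e2 divides e1, which is the number of values $piece can address. A minimal Python sketch follows. The function name length is hypothetical, and the delimiter is assumed to be non-empty.

```python
def length(s, delim=None):
    """Model of $len: string length, or the number of delim-separated pieces."""
    s = str(s)                    # numbers are measured as their string form
    if delim is None:
        return len(s)
    return s.count(delim) + 1     # non-overlapping occurrences plus one
```

For example, length("A.BX.Y", ".") gives 3 because two delimiters separate three pieces.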

  9. $name(vn[,count])

    The $name function returns the evaluated name of a variable with all or some of its subscripts. For example:

    set a=1,b=2,c=3,d=4
    set x(1,2,3,4)=99
    write $name(x(a,b,c,d)),!
    write $name(x(a,b,c,d),99),!
    write $name(x(a,b,c,d),4),!
    write $name(x(a,b,c,d),3),!
    write $name(x(a,b,c,d),2),!
    write $name(x(a,b,c,d),1),!
    write $name(x(a,b,c,d),0),!

    will write:

    x(1,2,3,4)
    x(1,2,3,4)
    x(1,2,3,4)
    x(1,2,3)
    x(1,2)
    x(1)
    x

  10. $order(vn[,d])

    The $order() function traverses an array from one sibling node to the next in ascending or descending key order. The result returned is the next value of the last index of the global or local array given as the first argument. The default traversal is in ascending key order. If the optional second argument is present and evaluates to "-1", the traversal is in descending key order. If the second argument is present and has a value of "1", the traversal is in ascending key order. Numeric indices are stored in ASCII collating sequence order.

    Examples:

    zmain
    for i=1:1:9 s ^a(i)=i
    set ^b(1)=1
    set ^b(2)=-1
    write "expect (next higher) 1 ",$order(^a("")),!
    write "expect (next lower) 9 ",$order(^a(""),-1),!
    write "expect 1 ",$order(^a(""),^b(1)),!
    write "expect 9 ",$order(^a(""),^b(2)),!
    set i=0,j=1
    write "expect 1 ",$order(^a(""),j),!
    write "expect 9 ",$order(^a(""),-j),!
    write "expect 1 ",$order(^a(""),i+j),!
    write "expect 9 ",$order(^a(""),i-j),!
    set i=""
    write "expect 1 2 3 ... 9",!
    for  do
    . set i=$order(^a(i),1)
    . if i="" break
    . write i,!
    set i=""
    write "expect 9 8 7 ... 1",!
    for  do
    . set i=$order(^a(i),-1)
    . if i="" break
    . write i,!
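One consequence of ASCII collating order is that multi-digit numeric indices do not sort by numeric value. The Python sketch below models sibling keys as strings compared character by character, which is an assumption drawn from the statement above, and shows where "10" would fall. The function name order is hypothetical.

```python
import bisect

def order(keys, current, direction=1):
    """Model of $order over sibling keys compared as ASCII strings.
    An empty current key means "start from the beginning (or end)"."""
    ordered = sorted(keys)
    if direction == -1:
        if current == "":
            return ordered[-1] if ordered else ""
        i = bisect.bisect_left(ordered, current)
        return ordered[i - 1] if i > 0 else ""
    if current == "":
        return ordered[0] if ordered else ""
    i = bisect.bisect_right(ordered, current)
    return ordered[i] if i < len(ordered) else ""

keys = [str(n) for n in range(1, 11)]   # "1" .. "10"
```

Under this model order(keys, "1") gives "10" and order(keys, "10") gives "2". Indices 1 through 9, as used in the example above, are unaffected because single digits sort the same way in both orderings.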

  11. $piece(e1,e2,i3) or $piece(e1,e2,i3,i4)

    The $piece function returns a substring of the first argument delimited by the instances of the second argument. The substring returned in the three argument case is that substring of the first argument that lies between the i3'th minus one and i3'th occurrence of the second argument. In the four argument form, the string returned is that substring of the first argument delimited by the i3'th minus one instance of the second argument and the i4'th instance of the second argument. If only two arguments are given, i3 is assumed to be 1. For example:

    $piece("A.BX.Y",".",2) yields "BX"
    $piece("A.BX.Y",".",1) yields "A"
    $piece("A.BX.Y",".",2,3) yields "BX.Y"

    $piece can be used on the left hand side of a set command or as an argument in a read command. In these cases, the first argument must be a local or global variable. The contents of this variable are altered to the value of the right hand side of the set statement or the value read by the read statement. The entire contents of the local or global variable are not altered, only the part which would have been extracted by the $piece function.
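Both uses of $piece, reading a field and replacing one in place, can be modeled by splitting on the delimiter and rejoining. The Python sketch below is illustrative only. The names piece and set_piece are hypothetical, the delimiter is assumed to be non-empty, and padding with empty pieces when the target piece does not yet exist is an assumption taken from common Mumps practice rather than from this text.

```python
def piece(s, delim, first=1, last=None):
    """Model of $piece: pieces first..last (1-based, inclusive), rejoined."""
    if last is None:
        last = first
    parts = s.split(delim)
    return delim.join(parts[max(first, 1) - 1:last])

def set_piece(s, delim, n, value):
    """Model of 'set $piece(var,delim,n)=value': replace only piece n."""
    parts = s.split(delim)
    while len(parts) < n:      # assumption: pad with empty pieces
        parts.append("")
    parts[n - 1] = value
    return delim.join(parts)
```

For example, set_piece("A.BX.Y", ".", 2, "Q") gives "A.Q.Y": only the second piece changes, and the rest of the string is left intact.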

  12. $qlength(e1)

    Returns the number of subscripts in the variable name. Examples:

    set i=1,j=2,k=3
    set b(1)=99
    w $ql("^a(i,j,k)"),!
    w $ql("^a(i,j,k)"),!
    w $ql("a(b(1),2)"),!
    write $ql("^a"),!

    The above writes 3, 3, 2, and 0, respectively.

  13. $qsubscript(e1,e2)

    The $qsubscript function returns a portion of the array reference. Examples:

    set i=1,j=2,k=3
    write $qsubscript("^a(i,j,k)",-1),!
    write $qsubscript("^a(i,j,k)",0),!
    write $qsubscript("^a(i,j,k)",1),!
    write $qsubscript("^a(i,j,k)",2),!
    write $qsubscript("^a(i,j,k)",3),!

    The above prints the empty string (environments are not defined in this implementation), ^a, 1, 2, and 3, respectively. Note: the variables or values of the subscripts must be valid.

  14. $query(e1[,c])

    The $query() function returns the next array element in the collating sequence. $query() takes a string as an argument rather than an actual global variable as is the case in the legacy Mumps standard. Return value: the next ascending entry in the global array data base. Index values in the returned string are normally quoted unless a second argument is given (see below). An empty string is returned when there are no more global arrays to return. If a second argument is given, its value replaces the quotes surrounding the returned index values.

    Note: this version of $query() returns the alphabetically next global in the global array file or the empty string. This means that when the elements of one global have all been returned, the elements of the next global will begin, or the empty string will be returned if the end of the global array data base has been reached. Consequently, you must test the returned value to determine whether the results have moved on to a new global array. Examples:

    zmain
    #
    # assume the global array data base is completely empty.
    #
    for i=1:1:3 for j=1:1:3 for k=1:1:3 set ^a(i,j,k)=i
    set x=$query("^a(1)")
    write "example 0 ",x,!
    for  do
    . set x=$query(x)
    . if x="" break
    . if $piece(x,"(",1)'="^a" break // $query() returns all globals
    . write "example 1 ",x,!
    set j=3
    write "example 2 ",$query("^a(j)"),! // variables permitted
    write "example 3 ",$query("^a(2*j-j)"),! // expressions permitted
    set a="2+1"
    write "example 4 ",$query("^a(@a)"),! // indirection permitted
    set z="^a(1)"
    write "example 5 ",$query(@z),! // indirection permitted
    set z="^a(2)"
    write "example 6 ",$query(z),!
    for a="^a(1)":$query(a):"" do
    . if $piece(a,"(",1)'="^a" break // $query() returns all globals
    . write "example 7 ",a," -> "
    . write $qsubscript(a,0)," ",$qsubscript(a,1)," ",$qsubscript(a,2)," ",$qsubscript(a,3),!
    write !
    // an alternative way of extracting index values
    for i="":$order(^a(i)):"" do
    . for j="":$order(^a(i,j)):"" do
    .. for k="":$order(^a(i,j,k)):"" do
    ... write "example 8 ",i," ",j," ",k,!

    writes:

    example 0 ^a("1","1","1")
    example 1 ^a("1","1","2")
    example 1 ^a("1","1","3")
    example 1 ^a("1","2","1")
    example 1 ^a("1","2","2")
    example 1 ^a("1","2","3")
    example 1 ^a("1","3","1")
    example 1 ^a("1","3","2")
    example 1 ^a("1","3","3")
    example 1 ^a("2","1","1")
    example 1 ^a("2","1","2")
    example 1 ^a("2","1","3")
    example 1 ^a("2","2","1")
    example 1 ^a("2","2","2")
    example 1 ^a("2","2","3")
    example 1 ^a("2","3","1")
    example 1 ^a("2","3","2")
    example 1 ^a("2","3","3")
    example 1 ^a("3","1","1")
    example 1 ^a("3","1","2")
    example 1 ^a("3","1","3")
    example 1 ^a("3","2","1")
    example 1 ^a("3","2","2")
    example 1 ^a("3","2","3")
    example 1 ^a("3","3","1")
    example 1 ^a("3","3","2")
    example 1 ^a("3","3","3")
    example 2 ^a("3","1","1")
    example 3 ^a("3","1","1")
    example 4 ^a("3","1","1")
    example 5 ^a("2","1","1")
    example 6 ^a("2","1","1")
    example 7 ^a("1","1","1") -> a 1 1 1
    example 7 ^a("1","1","2") -> a 1 1 2
    example 7 ^a("1","1","3") -> a 1 1 3
    example 7 ^a("1","2","1") -> a 1 2 1
    example 7 ^a("1","2","2") -> a 1 2 2
    example 7 ^a("1","2","3") -> a 1 2 3
    example 7 ^a("1","3","1") -> a 1 3 1
    example 7 ^a("1","3","2") -> a 1 3 2
    example 7 ^a("1","3","3") -> a 1 3 3
    example 7 ^a("2","1","1") -> a 2 1 1
    example 7 ^a("2","1","2") -> a 2 1 2
    example 7 ^a("2","1","3") -> a 2 1 3
    example 7 ^a("2","2","1") -> a 2 2 1
    example 7 ^a("2","2","2") -> a 2 2 2
    example 7 ^a("2","2","3") -> a 2 2 3
    example 7 ^a("2","3","1") -> a 2 3 1
    example 7 ^a("2","3","2") -> a 2 3 2
    example 7 ^a("2","3","3") -> a 2 3 3
    example 7 ^a("3","1","1") -> a 3 1 1
    example 7 ^a("3","1","2") -> a 3 1 2
    example 7 ^a("3","1","3") -> a 3 1 3
    example 7 ^a("3","2","1") -> a 3 2 1
    example 7 ^a("3","2","2") -> a 3 2 2
    example 7 ^a("3","2","3") -> a 3 2 3
    example 7 ^a("3","3","1") -> a 3 3 1
    example 7 ^a("3","3","2") -> a 3 3 2
    example 7 ^a("3","3","3") -> a 3 3 3
    example 8 1 1 1
    example 8 1 1 2
    example 8 1 1 3
    example 8 1 2 1
    example 8 1 2 2
    example 8 1 2 3
    example 8 1 3 1
    example 8 1 3 2
    example 8 1 3 3
    example 8 2 1 1
    example 8 2 1 2
    example 8 2 1 3
    example 8 2 2 1
    example 8 2 2 2
    example 8 2 2 3
    example 8 2 3 1
    example 8 2 3 2
    example 8 2 3 3
    example 8 3 1 1
    example 8 3 1 2
    example 8 3 1 3
    example 8 3 2 1
    example 8 3 2 2
    example 8 3 2 3
    example 8 3 3 1
    example 8 3 3 2
    example 8 3 3 3

    Note that in the above, if the argument ^a(1) is given, the next value in the global data base is ^a(1,1,1) since this is the next higher actual stored key. The key ^a(1) is not a stored key in the above.

    If a second argument is given it must be a character string of length one. The character will become the delimiter between the fields of the result - that is, the character specified will replace the "(", "," and ")" tokens in the returned string. For example:

    zmain
    for i=1:1:3 for j=1:1:3 for k=1:1:3 set ^a(i,j,k)=i
    set x=$query("^a(1)","#")
    write x,!
    set j=3
    write !,$query("^a(j)","#"),!
    set ^a(1)="^a(2)"
    set z="^a(1)"
    write $query(@z,"#"),!
    write $query(z,"#"),!

    writes:

    ^a#1#1#1#
    ^a#3#1#1#
    ^a#2#1#1#
    ^a#1#1#1#

    If you use this format, note that you cannot resubmit the returned global array reference to the function to obtain the next higher index. Only validly formatted global array references may be submitted as the first argument to the function.

  15. $random(i1)

    $random returns an integer in the range zero through i1-1. For example:

    $random(100) yields a value between 0 and 99

  16. $reverse(i1)

    The $reverse() function reverses the order of the characters in the argument string. Example:

    set x="now is the time"
    write $reverse(x),!

    The above writes "emit eht si won".

  17. $select(t1:e1,t2:e2,...tn:en)

    The $select function takes a variable number of arguments delimited by commas. Each argument consists of two parts: a logical expression and a result expression. The function evaluates in sequence each of the logical expressions (shown above as t1, t2, ...tn - note: these can be any expression in reality: a zero result is called false and a non-zero result is called true). If a logical expression is true, the result expression (e1, e2, ... en) is evaluated and becomes the value for the function. In the compiler, there is a maximum of ten argument pairs permitted.
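The key property described above is that result expressions are evaluated only for the first true test. The Python sketch below models that by passing each test and result as a zero-argument callable. The function name select is hypothetical, and raising an error when no test is true is an assumption, since the text does not say what happens in that case.

```python
def select(*pairs):
    """Model of $select: evaluate tests in order and return the result
    paired with the first test that is true (non-zero)."""
    for test, result in pairs:
        if test():
            return result()    # only this result expression is evaluated
    raise ValueError("no test in select was true")   # assumption

x = 5
size = select(
    (lambda: x > 10, lambda: "big"),
    (lambda: x > 3, lambda: "medium"),
    (lambda: 1, lambda: "small"),     # a constant true test acts as a default
)
```

With x set to 5, the second test is the first to succeed, so size is "medium" and the result expressions of the other pairs are never evaluated.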

  18. $text(L1) or $text(L1+/-I2) or $text(I1)

    $text() extracts text values from the source code. It only works on lines of original source code that began, after the line start character, with two semi-colons. The operand may be a label, a label plus or minus an offset, or an offset only. All references must be in the same file as is currently being compiled. An offset without a label is an offset from the first line of the file.

  19. $translate(S1,S2) or $translate(S1,S2,S3)

    If only two strings are given, characters appearing in the second string are removed from the first string. If three strings appear and the second and third string are of equal length, characters from the first string appearing in the second string are replaced by their counterparts from the third string. If the second string is longer than the third string, the characters from the second string which have no counterpart in the third string are removed. By "counterpart" we mean a character equally offset in the third string to the character in the second string.
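The "counterpart" rule amounts to a per-character lookup table built from the second and third strings. The Python sketch below follows the description above. The function name translate is hypothetical, and letting the first occurrence win when a character is repeated in the second string is an assumption.

```python
def translate(s, old, new=""):
    """Model of $translate: map each character of old to the character at
    the same offset in new, or delete it if new is too short."""
    table = {}
    for offset, ch in enumerate(old):
        if ch not in table:    # assumption: the first occurrence wins
            table[ch] = new[offset] if offset < len(new) else ""
    return "".join(table.get(ch, ch) for ch in s)
```

For example, translate("hello", "l") removes both occurrences of "l", while translate("a-b-c", "-b", "+") replaces each "-" with "+" and deletes "b", giving "a++c".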

  20. $view

    $view is not supported.

  21. $z...

    $z functions are extensions added by the implementor. A user may add new $z functions by modifying the zfcn() function. The Mumps compiler has the following $z functions:

    Math Functions

    The following functions are available. Their arguments and return values are the same as the corresponding C++ functions.

    • $zacos(arg) - computes the inverse cosine (arc cosine) of the input value. Arguments must be in the range -1 to 1.
    • $zasin(arg) - computes the inverse sine (arc sine) of the argument arg. Arguments must be in the range -1 to 1.
    • $zatan(arg) - computes the inverse tangent (arc tangent) of the input value.
    • $zcos(arg) - computes the cosine of the argument arg. Angles are specified in radians.
    • $zexp(arg) - calculates the exponential of arg, that is, e raised to the power arg (where e is the base of the natural system of logarithms, approximately 2.71828).
    • $zlog(arg) - Returns the natural logarithm of arg, that is, its logarithm base e (where e is the base of the natural system of logarithms, 2.71828...).
    • $zlog10(arg) - returns the base 10 logarithm of arg.
    • $zpow(arg1,arg2) - calculates arg1 raised to the exponent arg2.
    • $zsin(arg) - computes the sine of the argument arg. Angles are specified in radians.
    • $ztan(arg) - computes the tangent of arg.

    The following special purpose functions are available:

    1. The $zab(arg) function returns the absolute value of its numeric argument.

    2. The $zb(arg) function returns a string in which leading and all multiple blanks have been replaced by single blanks.

    3. The $zcd[(filename)] function dumps the globals to a sequential ASCII file in the current directory. If an argument is given, it is taken as the name of the file to which the globals will be written. If the argument is omitted, a file name is constructed from the system date of the form number.dmp where number is the high order 8 digits of the value of the C++ time() function at the time of the dump.

      The dump file is ASCII text. Each entry in the global array is represented by two lines. The first line is the global array reference and the second line is the stored value. In the global array reference, parentheses and commas are replaced by the "~" character.

    4. The $zchdir(directory_path) function changes the current directory to the path specified. If the operation succeeds, a zero is returned. If it fails, -1 is returned.

    5. The $zcl function restores the globals from the file named dump. Create this file by using the $zcd function and renaming the output file to dump. Please note this name. It is different than the names used by the dump function.

    6. The $zdate (or $ZD ) function returns the system date and time in standard system printable format. This includes: day of week, month, day of month, time (hour:minute:second), and year (4 digits).

    7. The function $zd1 returns the number of seconds since January 1, 1970 - a standard used in Linux. This number may be used to accurately correlate events.

    8. The function $zd2(InternalDate) translates the Linux time from $ZD1 into standard system printable format. The argument is a Linux format time value.

    9. The function $zd3(Year,Month,Day) returns the day of the year (Julian date) for the Gregorian date argument.

    10. The function $zd4(Year,DayOfYear) returns the Gregorian date for the Julian date argument.

    11. The function $zd5(Year, Month, Day) returns a string consisting of the year, a comma, the day of year, and the number of days since Sunday (Monday is 1).

    12. The function $zd6 returns a string consisting of the hour, a colon, and the minute.

    13. The function $zd7 returns a string consisting of the year, hyphen, month, hyphen, and day of month. If an argument is given in the form of the number of seconds since Jan 1, 1970, the result returned will reflect the argument date.

    14. The function $zd8 returns a string consisting of the year, hyphen, month, hyphen, and day of month, comma, and time in HH:MM format. If an argument is given in the form of the number of seconds since Jan 1, 1970, the result returned will reflect the argument date.

    15. The function $zd9 returns a string consisting of the year, hyphen, month, hyphen, and day of month, comma, day of week and time in HH:MM format. If an argument is given in the form of the number of seconds since Jan 1, 1970, the result returned will reflect the argument date.

    16. The $zf(arg) function returns a zero or one indicating if the file given as the argument exists.

    17. The $zg function gives a profile of the global file systems. It returns 6 values, separated by blanks, that give the address of the file system root block, the size of the DATA file, the size of the KEY file, the number of internal buffers in use, the file system mask in hexadecimal and the version number. The files are limited to 2 gigabytes each at present.

    18. The $zflush function flushes all modified native global array handler buffers to disk. The function should only be used with the native globals. After flushing, all updates to the btree file system have been committed. In cases where the internal buffers are very large, this function may take several seconds to execute. The function returns the empty string. Flushing the buffers is a precaution against system failure which would otherwise result in corruption of the global arrays.

    19. The $zh(arg) function encodes its argument in the form necessary to be a cgi-bin parameter. That is, alphabetics remain unchanged, blanks become plus signs and all other characters become hexadecimal values, preceded by a percent sign.

    20. The $zhit function calculates and returns the native global array cache hit ratio. This number ranges between zero and one. A value of one indicates all requests were satisfied from the cache while a value of zero indicates no requests were satisfied from the cache. Calling this function resets the hit ratio to zero.

    21. The $zm(global) function returns the next row of the global array matrix.

    22. The $zlower(string) function returns the input string with alphabetics converted to lower case.

    23. The $zn(arg[,arg2]) function normalizes word content for use with information storage and retrieval systems. It converts the word passed as an argument to lower case and removes non-alphabetics. The result is returned as the value of the function. If a second argument is given, the word is truncated to the numeric value of the argument. If no second argument is given, words are truncated to 25 characters if their length exceeds 25 characters. This function is used with document indexing applications.

    24. The $zp(arg1,arg2) function left justifies the first argument in a string whose length is given by the second argument, padding to the right with blanks.

    25. The $zr(arg) function returns the square root of its numeric argument.

    26. The $zs(arg) function takes one argument string which it passes to a shell for execution. Output is sent to standard out.

    27. The $zseek(arg) function takes one argument (a positive integer) which is a byte offset in the currently active (use) file. The command moves the file pointer to that location in the file. The function works with files longer than 2GB if the compiler was configured (see configure) for large file sizes.

    28. The $zsqr(arg) function returns the square of its numeric argument.

    29. The $zstem(arg) function returns the English word stem of the argument. This function attempts to remove common endings from words and return a root.

    30. The $zsystem(arg) function executes "arg" in a system shell. It returns -1 (fork failed) or the return code of the argument.

    31. The $ztell function returns the byte offset in the currently open file. Similar to the C++ ftell() function. This function works with files larger than 2GB. Note: The offset returned is for the file most recently made the default i/o file by the use command.

    32. The $zu(expression) function returns 1 if the expression is numeric, 0 otherwise.

    33. The $zwi(arg) function loads an internal buffer with the string given as the argument. The contents of this buffer are returned by the $zwn and $zwp functions.

      zmain
      set i="now, is the time, for all good"
      do $zwi(i)
      for w=$zwp write w,!
      write "-------",!
      do $zwi(i)
      for w=$zwn write w,!

      writes:

      now
      ,
      is
      the
      time
      ,
      for
      all
      good
      -------
      now,
      is
      the
      time,
      for
      all
      good

    34. The $zwn function returns successive words from the internal buffer delimited by blanks. When no more words remain, it returns an empty string (string of length zero). Returned words are converted to lower case. See example.

    35. The $zwp function returns successive words from the internal buffer delimited by blanks and punctuation characters. When no more words remain, it returns an empty string (string of length 0). Returned words are converted to lower case. See example.

    36. The $zwp(string) initializes the parse buffer but does not convert "string" to lower case as is the case with $zwi()
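Several of the functions in this list are defined by simple per-character rules. As one illustration, the encoding described for $zh can be sketched in Python as follows. The function name zh_encode is hypothetical, the input is assumed to be ASCII, and the use of two upper case hexadecimal digits is an assumption, since the text does not specify the exact format.

```python
def zh_encode(text):
    """Model of $zh: letters unchanged, blanks become '+', every other
    character becomes '%' followed by its hexadecimal code."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            out.append(ch)
        elif ch == " ":
            out.append("+")
        else:
            out.append("%{:02X}".format(ord(ch)))   # assumption: upper case hex
    return "".join(out)
```

Taken literally, the rule also encodes digits, so zh_encode("a b&c") gives "a+b%26c" and zh_encode("x1") gives "x%31".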

    Built-in Variables

    1. $horolog (Abbreviation $h) - The $H built-in variable returns a string consisting of two numbers. The first is the number of days since December 31, 1840 and the second is the number of seconds since the most recent midnight. The variable may not appear as the target of an assignment or read command. These values are relative to Greenwich Mean Time.

    2. $io (Abbreviation: $i) - $io gives the current unit number. Mumps I/O is, at any given time, directed to a given unit number. By default, I/O is directed to unit 5, the user's console. This is the unit on which all reads and writes take place. If the user opens another unit number for further file operations, the use command is used to redirect the read and write commands to that unit. The $io variable indicates the current I/O unit number. It may not appear as the target of an assignment statement or as an argument of a write command, although it may appear in both contexts as a source argument, such as in computation of an index of a target array.

    3. $job (Abbreviation $j) - The $job variable returns the system job number. This is the process PID.

    4. $test (Abbreviation $t) - The $test variable reflects the status after certain commands. Generally speaking, it is set to one (1) when the command succeeds and to zero (0) when the command fails. The value in $test remains until it is changed. Examples of commands that set $test are: read, open, if, and lock. In compiled code, the value of $test on entry to a block is restored on exit.

    5. $storage (Abbreviation $s) - Always returns 999.

    6. $x - The $x variable gives the current horizontal position of the record in the current unit number. For terminals, this is the horizontal cursor position. For other files, it is the number of characters since the start of the current record.

    7. $y - The $y variable gives the vertical position of the current unit number. It is pre-set to zero for each top of forms format control used.
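The two numbers in $horolog, described in the list above, can be derived from an ordinary calendar date and time. The Python sketch below is an illustration only. The function name horolog is hypothetical, and joining the two numbers with a comma is an assumption based on common Mumps practice, since the text does not state the separator.

```python
from datetime import date, datetime, timezone

def horolog(moment):
    """Model of $horolog: days since December 31, 1840 and seconds
    since the most recent midnight, for the given datetime."""
    days = (moment.date() - date(1840, 12, 31)).days
    seconds = moment.hour * 3600 + moment.minute * 60 + moment.second
    return f"{days},{seconds}"      # assumption: comma separator

# Current value relative to Greenwich Mean Time (UTC).
now_value = horolog(datetime.now(timezone.utc))
```

For example, horolog(datetime(1841, 1, 1)) gives "1,0", since January 1, 1841 is one day after the base date.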


    Installing the Compiler

    The compiler is supplied as both a compressed gzipped tar file (for Linux) and a zip file (for DOS). Uncompress and untar or unzip the distribution. Do not attempt to use the Linux distribution with DOS as it will not work correctly. In MS Windows, the unzip will create a directory named mumpsc in the root directory of the C: disk drive. The zip file consists primarily of executables while the Linux distribution contains the full source code. In Linux, the untar will create a directory named mumpsc in the directory from which you perform the untar. The following is an example of the untar/ungzip for Linux:

    tar xvzpf mumpscompiler-9.00.src.tar.gz

    Other Software Needed

    In addition to the software provided in this package, you will need additional software which is probably included in your Linux distribution disks. You should install:

    1. Required: The Perl Compatible Regular Expression Libraries which are used to support the pattern match operator (libpcre0 and libpcre0-devel) (http://pcre.org).
    2. Optional: The Berkeley Data Base to handle globals (db4, db4-devel, and db4-utils or higher). Note: you probably have db1 installed as it is used by several system functions. You will need BDB 4.3 or later, however. The present release is keyed to the most recent BDB data base release. If the BDB is not present, you must use the native global array system. This will be made the default by "configure" on Linux systems. The system was developed and tested with version 4.3.28.

    Note: the new C++ routines require a recent version of the C++ compiler that incorporates changes to the language standard made in 1999. Some Linux distributions may not have the most recent version of the g++ compiler. The code in this distribution was compiled using g++ version 3.4.3.

    MS Windows XP:

    You may either build the compiler from source code (this requires Cygwin and the MS Visual C/C++ compiler) or download the binaries (this requires only the MS Visual C/C++ compiler). You may also just download the binary executable interpreter, which requires no additional software.

    A free copy of the MS Visual C++ command line compiler is available from:

    http://msdn.microsoft.com/vstudio/express/visualC/default.aspx

    Cygwin is available from:

    http://www.cygwin.com/

    If you download the binaries, when the Windows version is unzipped, the distribution will reside in a directory named mumpsc in the root directory of your C: disk. The mumpsc directory will contain the manual ("compiler.html"). See the appropriate INSTALL file for details. The *.zip distributions contain binary executables, header files and documentation. Full source code is in the *.tar.gz files.

    1. MS Visual C++ Binary Install

      The Microsoft Visual C++ version of Mumps/MDH installs to c:\mumpsc when you unzip the distribution. This version contains both the compiler and MDH toolkit. It also contains modified pcre routines and header files.

      You must install a command line version of the Microsoft Visual C/C++ compiler. A free version of this compiler is available at:

      http://msdn.microsoft.com/vstudio/express/visualC/default.aspx

      The Mumps zip file distribution contains pre-built executable versions of the Mumps Compiler and run-time libraries. Unzip the distribution to your C:\ drive so that the files are located in the directory 'mumpsc' (this will be the default). The distribution will build a directory tree under 'mumpsc'. In this directory tree are the libraries, executables, some source files and documentation.

      Important: Once you have unzipped the files on C: drive, add c:\mumpsc\bin to your PATH:

      path=%path%;c:\mumpsc\bin;

      You can make this permanent by going to:

      Start | Settings | Control Panel | System

      and then clicking on the 'Advanced' tab, then on "Environment Variables". Edit the environment variable named 'Path' and add ';c:\mumpsc\bin' at the end. Be careful when editing this, as you could cause serious problems if this variable becomes corrupt.

      To compile a mumps program with the native globals data base, open a command prompt for the Visual C/C++ compiler which should be accessible as:

      Start | Programs | Visual C++ 2005 Express Edition | Visual Studio Tools | Visual Studio 2005 Command Prompt

      and type mumpsc xxx.mps

      where "xxx.mps" contains the mumps program. The executable will be "xxx.exe".

      To compile a C++ program that uses the MDH Toolkit:

      mumpsc zzz.cpp

      The stack size of running programs can be an issue. For WindowsXP, the stack size is set in the file "mumpsc.bat" in the batch file variable "STACK". It is presently set at 10,000,000 but may be adjusted higher or lower, as needed. Some programs, most notably the Smith-Waterman alignment procedure, may recurse many times and this may cause stack overflow unless the stack size is increased.

    2. MS Visual C++ Full Build Details

      To re-build the Windows XP compiler, you need Cygwin and the MS VC compiler.

      The build routine is "MSVCbuild.bat". This routine is included in the source distribution:

      mumpscompiler-x.xx.src.tar.gz

      which untars/ungzips under the directory 'mumpsc'. You should untar/unzip this file under cygwin in your cygwin home directory.

      The build bat file should be run from this directory from a Windows XP command prompt whose PATH and other environment variables have been set to access the Microsoft compiler. (See above regarding an MS Visual C++ command prompt).

      As the Linux/Cygwin 'configure' program does not work under Windows XP in native mode, you must run configure under Cygwin. This will configure the source C and C++ files for the MSVC environment.

      To configure for MSVC, under Cygwin, type:

      configure --with-msvc

      This will properly configure the source files.

      Note: if you do this under Linux, some functions may be incorrectly specified.

      To use the Berkeley Data Base, however, you must download and install the Berkeley package from

      http://www.sleepycat.com.

      The BDB is distributed under its own license which is different than the GNU GPL/LGPL. You should read and understand the Berkeley license before you use this data base.

      The file "db.h" is included in the main source distribution. It is located in include and renamed to be "db.x". This file is part of BDB 4.2.52 and is distributed to assist in compiling applications using BDB.

      You should update this file as needed for your version or rename the one provided to "db.h" before you build the system. Otherwise, you will see an error message when the BDB support functions are being built. You may ignore these errors if you do not plan to use the BDB.

      Do _NOT_ rename db.x if you are building the system for Linux or Cygwin.

      The BDB file "libdb42.dll", which is built by the BDB install package, should be copied to your c:\windows directory. The file "libdb42.lib", also built by the BDB install package, should be copied to c:\mumpsc\lib. NOTE: one file name ends in "lib" and the other in "dll" - don't confuse them!

      The batch file MSVCBuild.bat compiles the software, builds mumpslib.lib, and copies the relevant files to "c:\mumpsc\". Besides the MSVC++ compiler and its compile and runtime libraries, you need the Microsoft lib.exe program to create libraries. The file lib.exe is not part of the free distribution.

      The Mumps Compiler (mumps2c.exe) and the bat files to compile and run programs are copied to "c:\mumpsc\bin". This directory must be added to your PATH variable as shown above. The bat files in that directory, "mdh.bat" and "mumpsc.bat", are hardcoded to look for libraries and include files in "c:\mumpsc\". If you move or rename this directory, you must modify these files as well as mumpsc/MSVCbuild.bat.

      It is tricky to configure pcre 4.5 for Windows. Thus, provided with the distribution are object files for pcre.c and get.c (named get.obj and pcre.obj) from the pcre 4.5 distribution. Note also that the pcre 4.5 pcre.h file is in mumpsc/include. These files can be replaced with newer versions as needed. The pcre is covered by a separate license given elsewhere in this distribution.

      The c:\mumpsc directory contains:

      
      c:\mumpsc\bin            executables and scripts
      c:\mumpsc\lib            object libraries
      c:\mumpsc\include        header files
      c:\mumpsc\doc            documentation
      c:\mumpsc\src            source code
      
      Do not enable the optimization switch /O2 or other values that optimize for speed as they have caused problems with pattern matching.

    3. Windows XP Apache Web Server Interface

      If you want to use and test web based software, you will need a copy of the Apache Server for Windows XP. This can be downloaded from

      http://www.apache.org/dist/httpd/binaries/win32/#released

      Find the latest file of the form:

      http://www.apache.org/dist/httpd/binaries/win32/apache_2.2.3-win32-x86-no_ssl.msi

      download it then double click on it (it will initiate the self install procedure).

      To run programs through the web server, start the server (see Apache documentation). You will place your scripts in the default 'cgi-bin' directory which should be located at:

      C:\Program Files\Apache Software Foundation\Apache2.2\cgi-bin

      Copy to this directory the file 'cgi.mps':

      #++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
      #+ Mumps Compiler Run-Time Support Functions
      #+ Copyright (c) 2006 by Kevin C. O'Kane
      #+ okane@cs.uni.edu
      #+
      #+ This library is free software; you can redistribute it and/or
      #+ modify it under the terms of the GNU Lesser General Public
      #+ License as published by the Free Software Foundation; either
      #+ version 2.1 of the License, or (at your option) any later version.
      #+
      #+ This library is distributed in the hope that it will be useful,
      #+ but WITHOUT ANY WARRANTY; without even the implied warranty of
      #+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
      #+ Lesser General Public License for more details.
      #+
      #+ You should have received a copy of the GNU Lesser General Public
      #+ License along with this library; if not, write to the Free Software
      #+ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
      #+
      #+ http://www.cs.uni.edu/~okane
      #+
      #++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
       zmain
      +#include <mumpsc/cgi.h>
      + if (argc==2) $SymPut("%",argv[1]);
      + else $SymPut("%","");
       for i=1:1:10 set j=$p(%,"/",i,i+1) if $p(%,"/",i+1)="" quit
       x "do ^"_j
       halt

      Compile this program with the Mumps Compiler. This creates a program which will become the executing shell for your actual script. It will be named cgi.exe.

      To write an executable script, create a file with the '.cgi' extension such as:

      #!/Program Files/Apache Software Foundation/Apache2.2/cgi-bin/cgi
       write "Content-type: text/html ",!!
       write "<html><body>"
       write "Hello World<p>"
       for i=1:1:10 write i,"<br>",!
       write "</body></html>"
       halt

      The script, when invoked by the server, invokes the shell (cgi.exe) which parses the name of the script file you want to run and executes it. Name the script above "a.cgi".

      Pay close attention to the line:

      write "Content-type: text/html ",!!

      If this line is not correct, the page will not display. Also note that, for the interpreter, only blanks may appear at the beginning of lines. Do not use tab characters. The URL will be:

      http://127.0.0.1/cgi-bin/a.cgi

      Where 'a.cgi' is the name of the script file immediately above. The browser should display "Hello World" followed by the numbers 1 through 10, one per line.

    Linux and Cygwin Installs:

    Download and un-tar/un-gzip the distribution:

    http://math-cs.cns.uni.edu/~okane/cgi-bin/newpres/m.compiler/compiler/index.cgi

    A directory named mumpsc will be created by the un-gzip/un-tar procedure.

    Linux/Cygwin Quick Root Install

    For the impatient, login as root and try the following:

    Linux:

    tar xvzf mumpscompiler-9.01.src.tar.gz
    cd mumpsc
    ./configure prefix=/usr
    make
    make install

    use:

    ./configure prefix=/usr --with-cpu64

    if you have a 64bit processor.

    Cygwin:

    tar xvzf mumpscompiler-9.01.src.tar.gz
    cd mumpsc
    ./configure prefix=/usr
    make
    make install

    use:

    ./configure prefix=/usr --with-cpu64

    if you have a 64bit processor.

    The commands "mumps" and "mumpsc" should now be active.

    Full Linux/Cygwin Installation

    Installation is a two step process:

    • compile and link the modules
    • copy the modules to target directories

    First, run the autoconfig script:

    ./configure

    This will test your configuration and provide any warnings concerning missing libraries. See configure.ac for autoconfig source details. The "configure" script has many options, all of which have default values. You may override these defaults with parameters to the "configure" command.

    By default, the compiler will be built in your home directory in a directory named "mumps_compiler" (~/mumps_compiler).

    To build in other directories, use a command such as:

    ./configure prefix=/usr

    This will build the installation in the system directories (/usr/include, /usr/bin, /usr/lib, etc). You must be root to do this.

    If you want the Berkeley Data Base facility to be the default global array file handler, add the parameter "--with-bdb" to the "configure" command, for example "./configure prefix=/usr --with-bdb".

    If you want to take advantage of very large files and very large native global arrays (file sizes greater than 2 gigabytes), you need to enable large file support:

    ./configure prefix=/usr --with-native

    The native globals btree has an internal cache. Each block in the cache is the same size as the btree block (default: 1024). By default, the cache is set to 9. This may be overridden by "configure":

    ./configure prefix=/usr --with-native --with-cache=1025

    where the number specified (1025 in this case) will be the number of blocks cached. This value may not be less than 9 and must be selected from a list of values (see table below).

    ./configure --with-cpu64

    The "--with-cpu64" option enables code for 64 bit processors using gcc/g++ compilers enabled to generate 64 bit code. This option has been tested to a limited degree on Intel Xeon64 processors. The --with-cpu64 option causes compiles to look for libraries in /usr/lib64.

    The "configure" file (see "configure.ac") has been developed with Red Hat and Debian releases. It has had limited testing with Apple OS/X. It is not suitable for DOS systems.

    configure options

    Option                Meaning                                             Default
    --with-cpu64          Enables 64 bit code generation.                     32 bit code
    --with-msvc           Builds modules so they can be compiled with the     not enabled
                          Microsoft Visual C++ compiler.
    --with-djgpp          Builds modules for compilation with the DJGPP      not enabled
                          compiler. The --file-64 option must be omitted
                          with this option.
    --with-native         Makes native globals the default global file        Native data base
                          handler.
    --with-bdb            Makes Berkeley Data Base the default global file    Native data base
                          handler.
    --with-includes=DIR   Identifies header directories for the Apple port.   N/A
    --with-libraries=DIR  Identifies library directories for the Apple port.  N/A
    --with-cache=VAL      Native globals cache size (see note 1).             1025
    --with-ibuf=VAL       Maximum text size of an interpreted program         20000
                          (see note 2).
    --with-strmax=VAL     Maximum internal string size (STR_MAX).             4096
    --with-block=VAL      Native btree block size (see note 3).               1024

    Notes:

    1. The permitted values for --with-cache are: 9, 17, 33, 65, 129, 257, 513, 1025, 2049, 4097, 8193, 16385, 32769, 65537, 131073, 262145, 524289, or 1048577.

    2. The maximum value for --with-ibuf is 32000 and the minimum value is 4096. In order to run "checkout.mps" in the distribution, this value needs to be at least 20000.

    3. The value for --with-block must be one of: 1024, 2048, 4096, 8192, 16384, 32768, 65536, 131072, 262144, 524288, 1048576, or 2097152. The size of the btree block determines the maximum file size. A block size of 1024 permits the key file to grow to 2 terabytes, a block size of 2048 permits 4 terabytes, a block size of 4096 permits 8 terabytes, and so on (the limit doubles at each step). Larger blocks, however, can result in slower performance. You should select the smallest block size possible for your application. A block must be capable of storing a key plus about 30 bytes of overhead.
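    The constraints on --with-cache and --with-block can be checked before running "configure". The following Python sketch is not part of the distribution and its function names are hypothetical. It assumes that permitted cache values have the form 2**n + 1 (9 through 1048577) and that permitted block sizes are powers of two from 1024 through 2097152. The file size estimate assumes a limit of 2**31 blocks per key file; that figure is inferred from the 2 terabyte limit quoted for 1024 byte blocks, not taken from the source code.

```python
# Hypothetical helpers; not part of the Mumps distribution.

def valid_cache(val):
    """True if val is 2**n + 1 for some n in 3..20 (9 through 1048577)."""
    return val in {2**n + 1 for n in range(3, 21)}


def valid_block(val):
    """True if val is a power of two from 1024 through 2097152."""
    return val in {2**n for n in range(10, 22)}


def max_key_file_bytes(block):
    """Assumed limit: 2**31 addressable blocks of the given size."""
    if not valid_block(block):
        raise ValueError("unsupported block size: %r" % block)
    return (2**31) * block


if __name__ == "__main__":
    for block in (1024, 2048, 4096, 8192):
        terabytes = max_key_file_bytes(block) // 2**40
        print(block, "byte blocks ->", terabytes, "TB")
```

    Under these assumptions a value such as 1000 is rejected as a cache size, while 1025 is accepted.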

    The "configure" script will generate warnings and/or error messages if certain required modules are missing.

    To install for Red Hat based Linux distributions (such as Red Hat and Mandrake), type:

    make

    then, as root if you are installing to system directories, type:

    make install

    To compile a Mumps program type:

    mumpsc myprog.mps

    and the result will be "myprog.cgi" if you followed the syntax rules in "compiler.html".

    Note: if you allowed "configure" to install the compiler files to the default "$HOME/mumps_compiler" directory, you should add "$HOME/mumps_compiler/bin" to your "PATH" variable.

    Implementation specific notes

    Some versions of Linux (e.g., Debian) install some libraries under different names and in different directories.

    The install will place several "man" pages on your system. These are:

    1. mumpsc

    If you are going to use the Berkeley DB, you need to be sure that it has been installed on your system. If not, install it before you install the Mumps Compiler. It generally goes by the name db3 or db4 on your distribution disks. The Mumps Compiler will not work with anything prior to db3 and there may be problems if you have an early version of db3. Please try to use the latest version available. If you receive an error message concerning the function stat on or about line 74 in global.c, alter the function call to remove the next to last argument (NULL). This problem is caused by some versions of the Berkeley DB having a different calling structure than others. Be aware that some systems install DB3 and DB4 to directories other than those for which the system was written. If you get error messages listing unknown header or library files, this may be your problem.

    Each new release of the BDB seems to change important API calling parameters. We attempt to keep the current Mumps release in compliance with the most recent BDB release and their changes but this is not always possible. Consequently, you may experience errors when compiling global.c as this is the interface with the BDB. This may be particularly true if you are using a release of the BDB prior to Version 3.2.9.

    You may now try the checkout code. If you have installed the software locally to your directory, be certain to adjust the PATH variable as noted above and re-logon.

    Move to the sub-directory mumpsc/doc/examples and compile the checkout routine:

    mumpsc checkout.mps

    This will compile the Mumps program "checkout.mps". Please note: this is "checkout.mps", not "checkout1.mps". The file "checkout1.mps" is invoked by "checkout.mps" and contains code in a format suitable for the compiler (i.e., it has a "zmain" command).

    The default compilation uses the native global array file system. You may request the BDB global array file system by:

    mumpsc -b checkout.mps

    Alternatively, if you have made the BDB file system the default (with "configure") you may force the native file system with:

    mumpsc -n checkout.mps

    Present in the examples directory along with checkout.mps is a file named checkout1.mps which is essentially similar. Do not compile this file. It will be interpreted by checkout.mps as a result of an Xecute command. To run the compiled program, type:

    checkout.cgi

    Many lines will flash by. It will pause and ask you for input: type a few random characters. As the last test, the compiled program will execute (via the Xecute command) the file checkout1.mps. This file is similar to checkout.mps but used for testing the interpreter. It too will ask for some random characters. Finally, the interpreted file will finish and the compiled file will end.

    There are also validation routines in the directory "validate" that may be compiled and executed to test your installation.

    There are two global array file systems. The default is the native global array handler. The other system is referred to in this manual as the Berkeley Data Base (BDB) array file system. The native global array file system will produce data files named key.dat and data.dat while the Berkeley DB will produce a file named Mumps.DB as well as several files each of whose names begins with the prefix "__db". Erase these files should you re-run the test code as residual data in them will cause subsequent tests to fail.
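    Removing the residual files before re-running the tests can be scripted. The following Python sketch is not part of the distribution; the function name is hypothetical and it assumes the default file names given above (key.dat, data.dat, Mumps.DB and files beginning with "__db") in a single directory.

```python
import glob
import os


def remove_global_files(directory="."):
    """Delete leftover global array files and return the names removed."""
    patterns = ["key.dat", "data.dat", "Mumps.DB", "__db*"]
    removed = []
    for pattern in patterns:
        # glob.escape protects directory names containing wildcard characters
        for path in glob.glob(os.path.join(glob.escape(directory), pattern)):
            if os.path.isfile(path):
                os.remove(path)
                removed.append(os.path.basename(path))
    return sorted(removed)


if __name__ == "__main__":
    for name in remove_global_files():
        print("removed", name)
```

    Run it only when no Mumps program is using the data base, since the files are deleted without confirmation.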

    Multiple compiled Mumps programs may be executed concurrently with the Berkeley DB. The underlying Berkeley DB software will arbitrate disk accesses in the event of two or more concurrent attempts to store into the data base. The native global array handler, on the other hand, will take exclusive control of the data base for the duration of a program's execution. Other programs will pend waiting for the global arrays to become available.

    You may want to check the preprocessor defined variables in include/mumpsc. These may need to be set for your operating system and for other options. These files contain operating system and environment specific settings in preprocessor defined variables located near the beginning of the file. Each contains a comment as to its optional settings. The default settings should be adequate for most purposes.

    If you need to debug C++ programs generated by the Mumps Compiler, you may compile from the C++ source by:

    mumpsc myprog.c

    where "myprog.c" was created from a Mumps program named "myprog.mps" by the compiler. This will enable you to modify the generated C++ program and still compile it with all the options and libraries required by Mumps.

  22. There is a set of checkout routines in the directory "validate" that can be used to verify your installation and as examples of the dialect of mumps accepted by the compiler.

Compiling Programs

Write your Mumps programs with any standard ASCII editor such as vi or emacs. If you use a word processor, be certain to save the file as ordinary text: some word processors embed non-printing control characters in the saved text, and these will cause problems.

The format of a line of Mumps code is: an optional label followed by one or more blanks followed by the first command word of the line. A label, if present, must begin in column 1. For compiled code, blanks are preferred to tab characters but both may be used. For interpreted code, blanks are required (that is, tab will not work). If you need to have the interpreter recognize tab characters at the beginning of a line instead of blanks, change the TAB defined symbol in interp.h to \t and recompile. Your program may include blank lines which are ignored.
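Because the interpreter rejects tabs before the first command word, it can help to scan a source file for them before running it. The following Python sketch is not part of the distribution; the function name is hypothetical and it checks only the white space between the optional label and the first command word.

```python
import re


def lines_with_leading_tab(source):
    """Return 1-based numbers of lines with a tab before the first command word."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        if line.strip() == "":
            continue  # blank lines are ignored by Mumps
        # optional label (non-blank characters), then the separating white space
        match = re.match(r"\S*([ \t]+)", line)
        if match and "\t" in match.group(1):
            flagged.append(number)
    return flagged


if __name__ == "__main__":
    sample = "lab set x=1\n\twrite x\n write x\n\nend\thalt"
    print(lines_with_leading_tab(sample))
```

In the sample, lines 2 and 5 use a tab where the interpreter expects blanks, so those two line numbers are reported.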

The file extension of your Mumps source programs should be .mps.

To compile a Mumps program named myprog.mps, type the following:

mumpsc myprog.mps

This will generate myprog.c which will be subsequently compiled to the binary executable named myprog.cgi. The default compilation will be for native globals in standalone mode.

Use of the Berkeley DB in standalone mode can be explicitly stated with one of the following commands:

mumpsc -b myprog.mps
mumpsc --berkeley myprog.mps

To compile for the native globals, use one of the following commands:

mumpsc -n myprog.mps
mumpsc --native myprog.mps

The "n" or "native" option uses the large file system capable version of the native globals. This version must be run on a file system that supports files larger than 2 gigabytes (for example, ext3, xfs and several others).

In "native" mode the data base is stored in two files: "key.dat" and "data.dat". These may reside on different drives. The file "key.dat" contains the btree of globals and "data.dat" holds any associated data strings. The "key.dat" file can grow to multiple terabytes and the "data.dat" file can grow to 2**64 bytes (64 bit addressing).

The compiler (named mumps2c) may generate a few local header files that will be used during the C++ program compilation. They are generally of the form xxx.m. They may be safely deleted after compilation.


Global Arrays

There are, since version 4, two forms of global arrays: those based on the original Btree package developed as part of this project and those based on the Berkeley Data Base (http://www.sleepycat.com). The former will be called the native global arrays and the latter the BDB global array data base in this document. Native globals are the current default.

Full documentation concerning the BDB is available from http://www.sleepycat.com. The BDB is licensed separately by its distributor. The BDB license is similar to the GPL but has important differences. You should read the license to see if it suits your application.

Mumps programs may be compiled for the BDB with one of the scripts listed above. If you use the BerkeleyDB, you must also install the Berkeley DB libraries on your machine, version 3.1.17 or later.

The BDB will create a file named Mumps.DB and several files with the prefix "__db." in the directory from which the Mumps program is run unless they already exist. The name and location of this file can be altered in the global.h header file.

Native global arrays can be either permanent or temporary. If temporary, they are created when a translated Mumps program first accesses them and destroyed upon program termination. The global arrays are temporary if the NEW_TREE preprocessor variable is set to 1 in sysparms.h. Temporary global arrays are useful if your data are loaded each time you run your program or if your permanent data base resides on a server. Temporary global file names are constructed with a value based on the program's process id.

Permanent global arrays, the default, continue to exist after a program terminates. Global arrays will be permanent if the NEW_TREE preprocessor variable in sysparms.h is set to 0. Only permanent global arrays are possible with the BDB.

Native global arrays reside in two files. The names of these files can be set at compile time or run time by altering the values in the C++ variables "cfgdata" and "cfgkey". The filenames in these, including path names, will be used to open the global array files. The default names are "data.dat" and "key.dat". The default names are in "btree.h" in the defined variables "UDAT" and "UKEY".

To change the native global array file names at run time, you must change them prior to any global array access. You do this by including two embedded lines of C++ code in your Mumps program. The following is an example of how to change the file names at run time:

 zmain
+ strcpy(svPtr->cfgdata,"mydata.dat");
+ strcpy(svPtr->cfgkey,"mykey.dat");
 Set ^a(1)="test"
 . . .

The C++ function strcpy is used to change the values in the C++ variables svPtr->cfgdata and svPtr->cfgkey. The pointer svPtr contains the address of your program's state vector and is used whenever access to the state vector is required. The state vector is ordinarily not accessed but its definition is in include/mumpsc/stateVector.h. These lines must appear before the first reference to a global array, because the global array handler uses the contents of these variables when it opens the files. You may select any names supported by your system.

You may want to place the files on different disks to improve performance. One ("key.dat") contains the btree and the other ("data.dat") contains data stored at nodes. In MS Windows, you may include disk name information. Be certain that your compiled program has read/write permissions for the directory in which you place the files.

The BerkeleyDB global arrays normally reside in a file named "Mumps.DB". The Berkeley DB will also create some files whose names begin with "__db." - these are part of the environment. The data base, however, is resident in "Mumps.DB". The name of the "Mumps.DB" file may be changed in "globals.h".

For Mumps programs, the global arrays are opened automatically the first time they are used and are initialized if they do not yet exist. Compiled programs that never reference the global arrays will not open, create or initialize them. If you are using the native globals and they are already in use by another Mumps program, your Mumps program will wait. If you use the Berkeley DB, multiple programs may concurrently access the data base.

If you use the native globals and if your program has a main routine, the globals will be automatically closed at termination. Otherwise, you must explicitly close them with a "zclose" command. Failure to close the globals may result in data and file integrity loss.

Global array references may contain string subscripts. Only printable ASCII characters are permitted as values for global array subscripts. The default collating sequence is ASCII for all indices (including numeric indices).

The global array file(s) will grow in size as elements are added. These file(s) may be copied for backups. Note: there are also builtin functions "$zcd" and "$zcl" that will dump and restore the globals to/from a flat ASCII file (see below).

Global array indices may consist of printable ASCII characters in the range 32 through 126. Normally, subscripts are stored as ordinary ASCII text (note: the BDB has an option to store all index data in an encrypted format).

The Berkeley DB arrays can be used concurrently by multiple programs on the same machine.

To configure the native global arrays, see the file "btree.h.in". The file "btree.h" is automatically generated from "btree.h.in" when "configure" is run. Most of the values below can and should be set by "configure"; however, they can be set manually, such as when working with non-standard, proprietary systems such as MS Windows. The main configuration options are:

ADSIZE Set to 4 if your file system supports only 32 bit file offset addressing; set to 8 if your file system supports 64 bit file offset addressing.
UDAT, UKEY The names of the data and key files respectively. The defaults are: "data.dat" and "key.dat".
SEED, NBR_ITERATIONS The random number generator seed used in system testing and the number of test iterations.
BLOCK The size of each btree block in bytes. The default is 8192. This number must be a power of 2 from 1024 to 262144. The default value appears to be optimal in testing.
PAGE_SHIFT The number of bits to shift when storing file offsets. File offsets are stored in 4 bytes with trailing zeros removed. The number of zeros is determined by the BLOCK size. A BLOCK size of 8192 corresponds to a PAGE_SHIFT of 13 and a maximum file size of 16 terabytes. This value is set automatically depending on the value of BLOCK.
GBLBUF The number of internally cached btree blocks. The number should be appropriate to your memory size. The total memory required is roughly GBLBUF*BLOCK. The default BLOCK is 8192 and the default GBLBUF is 4097 and these require 33,562,624 bytes. GBLBUF must be a power of 2 plus 1. Examples: 9, 17, 33, 65, 1025, 2049, 4097. The value of AMASK is dependent on the value of GBLBUF.
AMASK Associative cache mask. This value is dependent upon GBLBUF. It is used to mask out the high bits of an address to produce an associative array address. The value of the mask should equal (in hex) GBLBUF-2. For example:

if GBLBUF==9 define AMASK 0x7
if GBLBUF==17 define AMASK 0xf
if GBLBUF==33 define AMASK 0x1f
if GBLBUF==65 define AMASK 0x3f
if GBLBUF==129 define AMASK 0x7f
if GBLBUF==257 define AMASK 0xff
if GBLBUF==513 define AMASK 0x1ff
if GBLBUF==1025 define AMASK 0x3ff
if GBLBUF==2049 define AMASK 0x7ff
if GBLBUF==4097 define AMASK 0xfff
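
The relationships among BLOCK, PAGE_SHIFT, GBLBUF and AMASK can be worked out mechanically. The following Python sketch is not part of the distribution and the function name is hypothetical; it only restates the rules given above (PAGE_SHIFT is the base 2 logarithm of BLOCK, AMASK is GBLBUF-2, and cache memory is roughly GBLBUF*BLOCK).

```python
# Hypothetical helper; restates the btree.h relationships described in the text.

def btree_params(block, gblbuf):
    """Return (PAGE_SHIFT, AMASK, approximate cache memory in bytes)."""
    if block <= 0 or block & (block - 1) != 0:
        raise ValueError("BLOCK must be a power of 2")
    if gblbuf < 9 or (gblbuf - 1) & (gblbuf - 2) != 0:
        raise ValueError("GBLBUF must be a power of 2 plus 1, at least 9")
    page_shift = block.bit_length() - 1  # log2(BLOCK); 8192 -> 13
    amask = gblbuf - 2                   # 4097 -> 0xfff
    memory = gblbuf * block              # roughly GBLBUF*BLOCK bytes
    return page_shift, amask, memory


if __name__ == "__main__":
    shift, mask, memory = btree_params(8192, 4097)
    print("PAGE_SHIFT", shift, "AMASK", hex(mask), "memory", memory)
```

For BLOCK 8192 and GBLBUF 4097 this gives a PAGE_SHIFT of 13, an AMASK of 0xfff and 33,562,624 bytes of cache, matching the figures quoted above.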

If your program crashes after opening and updating the global arrays, the global array file(s) may be corrupt and may need to be deleted and restored. The reason for this has to do with the buffering of global array I/O requests. If the program is terminated and the buffers are not written, the B-tree on disk is invalid. For the Linux versions, SIGINT interrupts are normally trapped and the global array buffers are written to disk. With the Berkeley DB, if the data base is corrupt, your program may hang upon start up. Delete the files "Mumps.DB" and the files that begin "__db" and reload the data base.

The cache size for the BDB is set to 1,000,000 bytes. This may be altered. The maximum cache size supported is 4 GB and the minimum is 20,000 bytes. The cache size may be changed in "globals.h" through the "CACHE_GIGS" and "CACHE_BYTES" defined symbols. The BDB page size ranges between 512 and 65536 and may be set with the "PAGE_SIZE" symbol. The default is 1024. The cache size may be overridden at run time if you place a file named "DB_CONFIG" in the directory in which the Mumps program runs and include in that file a line such as:

set_cachesize 1 0 0

where the parameters are: number of gigabytes, number of bytes and number of contiguous caches (0 or 1 means one cache, more than 1 means the cache will be broken into multiple parts).
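
A total cache size in bytes can be split into the gigabyte and byte fields that "set_cachesize" expects. The following Python sketch is not part of the distribution; the function name is hypothetical, and it assumes that a gigabyte here means 2**30 bytes and that the 20,000 byte minimum and 4 GB maximum stated above apply.

```python
GIGABYTE = 1024 ** 3  # assumption: the gigabyte field counts units of 2**30 bytes


def cachesize_line(total_bytes, ncache=0):
    """Format a DB_CONFIG set_cachesize line for a total cache size in bytes."""
    if not 20000 <= total_bytes <= 4 * GIGABYTE:
        raise ValueError("cache size must be between 20,000 bytes and 4 GB")
    gigs, rest = divmod(total_bytes, GIGABYTE)
    return "set_cachesize %d %d %d" % (gigs, rest, ncache)


if __name__ == "__main__":
    # Write the line to DB_CONFIG in the directory where the program will run.
    print(cachesize_line(GIGABYTE))
```

A one gigabyte cache yields the line "set_cachesize 1 0 0" shown above.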

If you have an application that used a large cache size, you must delete the files named __db* before setting the cache to a new size. The __db* files may be safely deleted when no program is accessing them. The data base is actually in "Mumps.DB".

The cache and block sizes for the native globals are set by "configure" - see above. The native array cache is a one-way associative cache. It should be set to as large a value as is practical for the physical memory facilities of your machine. Note: if you have large block and cache sizes, you will notice delays on startup and termination. During startup, the global array buffers are allocated and initialized and during termination, the internal buffers are written to disk. These activities may be time consuming, depending upon your system.

The native global array file system is distributed under provisions of the LGPL license and may be more appropriate for proprietary applications. At present, however, it lacks many of the failsafe and transaction processing features of the BDB.

The btree processor underlying the native globals may also be used standalone as a C++ routine. See the documentation in "btree.c".


Using Mumps with a Web Server

  1. The CGI interface

    To use a Mumps program with a web server, compile the Mumps program with the cgi.h header file. To do this, place the following line after the zmain:

    + #include <mumpsc/cgi.h>

    This header file contains the code necessary to access the Web server QUERY_STRING environment variable.

    Place the Mumps program in the Web server's cgi-bin (or other appropriate directory). From HTML documents, you reference Mumps programs with lines of the form:

    <A HREF="/cgi-bin/yourprogram.cgi?var1=11111&var2=123"> test </A>

    Here, the name of the compiled (binary) Mumps program is taken to be yourprogram.cgi but in some systems the file extension may need to be other than .cgi. Check your web server documentation to determine if executable files require a specific extension other than .cgi. An executable Mumps program must be given adequate permissions to be executed by the web server. Also, the web server must have permission to read and write files in the directory in which the global arrays are placed. Incorrect permissions and file ownership will result in failure. Temporary global arrays are placed in the directory that contains the program being executed. The web server must have permission to write files in this directory, too.

    Upon initialization of the Mumps compiled program (if the cgi.h file was included during compilation), the variables appearing in the HREF (var1 and var2 in the above) will exist in the Mumps symbol table and have the values provided by the browser (11111 and 123, in this case). The web server extracts the parameters from the HREF and places them in an operating system environment variable named QUERY_STRING. The software in cgi.h looks for QUERY_STRING and decodes it (non-alphabetic and non-numeric characters are encoded). It takes each parm=value expression and creates a Mumps variable named parm and sets its value to value. If cgi.h is omitted, you will not have access to web server passed values. The total length of the parameter string in QUERY_STRING may not exceed 1024 bytes. The full value of QUERY_STRING is also contained in the Mumps variable %QS during execution. The executing Mumps program may determine that it was invoked by the web server by testing for the existence of %QS. The environment variable REMOTE_ADDR is also captured by cgi.h and stored as the Mumps runtime variable %RA. You may also invoke a Mumps program in simulated Web server mode by calling it from a shell script such as the following (for the Linux Bash Shell):

    #!/bin/bash
    QUERY_STRING="&var1=value1&var2=value2"
    export QUERY_STRING
    yourprogram.cgi
    unset QUERY_STRING

    Here, the Mumps program will execute with parameters var1 and var2. If the values of the variables passed through QUERY_STRING need to contain blanks or special characters, they must be encoded in the manner prescribed by the HTML standard. From a Mumps program, the Mumps built-in function $zhtml can be used for this purpose (see below).
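
    The decoding of QUERY_STRING into named variables can be illustrated outside Mumps. The following Python sketch is not the cgi.h implementation; it uses the standard urllib.parse module to show the same idea, and the function name and the dictionary standing in for the Mumps symbol table are hypothetical.

```python
import os
from urllib.parse import parse_qsl

MAX_QUERY = 1024  # length limit stated in the text


def decode_query(query):
    """Decode parm=value pairs into a dict that mimics the symbol table."""
    if len(query.encode()) > MAX_QUERY:
        raise ValueError("QUERY_STRING exceeds 1024 bytes")
    # parse_qsl skips empty fields and decodes '+' and %XX escapes
    symbols = dict(parse_qsl(query, keep_blank_values=True))
    symbols["%QS"] = query  # the full, undecoded string
    return symbols


if __name__ == "__main__":
    print(decode_query(os.environ.get("QUERY_STRING", "")))
```

    With QUERY_STRING set to "&var1=value1&var2=value2", the result holds var1, var2 and %QS, which parallels the variables a Mumps program compiled with cgi.h would see.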

    Be certain to Halt at the end of a program in order to terminate the program. If you do not correctly terminate the Mumps program, the web server may hang.

    When using temporary global arrays, the global array names are constructed from the process id and are therefore unique. For programs with permanent native globals not using the server option, compiled Mumps programs ensure that only one program is running at any given time on any given set of database files. Non-server native global Mumps programs open the database files for exclusive access. Thus, non-server native globals Mumps programs should be short, transaction oriented jobs that do not delay the system. When a non-server native globals Mumps program attempts to execute and the files it wants are in use, it waits. Note: since only one copy of a non-server native globals Mumps application is ever running for a given database, all file accesses are also exclusive and thus the Lock command has no meaning. The Berkeley Data Base permits multiple programs to read from the data base concurrently but only one program to update the data base at any one time.

  2. Mumps Apache Modules

    Mumps programs can now become Apache modules under late versions of Linux and Apache 2. This means that your Mumps program can be directly executed as a loadable shared object by the Apache server. In the distribution, there is a code module, mod_mumps.c, that builds a Mumps module that will permit the native Mumps Compiler interpreter facility to execute as an Apache 2 dynamically loadable object. The following are the details to install this facility:

    1. System Requirements:
      • Apache 2.0 with mod_so support built in.
      • Apache development tools (headers and the utility apxs)
      • The Perl Compatible Regular Expression Libraries which are used to support the pattern match operator (libpcre0 and libpcre0-devel) (http://pcre.org).
      • The Berkeley Data Base to handle globals (db4, db4-devel, and db4-utils or higher). Note: you probably have db1 installed as it is used by several system functions. You will need db4 or later also.
      • The OpenSSL library to permit secure encrypted transmission of data (see http://www.openssl.org). [this package is optional]

    2. Installation Instructions:

      A quick script to do most of what is listed below is in MakeApacheModule. Note: this script does not add the lines to httpd.conf (see below). You must do this yourself. Read the script for details.

      To install this module, you must have Apache installed with the development tools and the Mumps compiler package, as well as a standard C++ compiler (which is required by the previous tools).

      From within the directory of your module source, you need privileges to install modules into the Apache directory. You should be logged in as root to do this.

      To compile the module:

      /usr/local/apache2/bin/apxs -c -a -i mod_mumps.c -lmumps -lmpsglobal_bdb -ldb -lpcre -lapr-0 -laprutil-0

      This will create the module files and install them into the default module directory created by Apache. Then it will edit the httpd.conf file located within the configuration directory of your Apache installation.

      The default installation directory if you build and install with the download from www.apache.org will be /usr/local/apache2. We assume this directory in our code.

      After you compile and install the module you need to edit the httpd.conf to ensure that the correct settings are installed. This file is in /usr/local/apache2/conf.

      Under the Dynamic Shared Object Support section of the httpd.conf file, you should see the following line below the description:

      LoadModule mumps_module modules/mod_mumps.so

      This line should have been placed there by the apxs tool when you created the module. If it is not there, search for it elsewhere in the file and make sure it is present.

      After you verify that line exists in the configuration file, you must then tell Apache to give every request for a mumps file to the correct module. This next set of lines accomplishes this goal for every file with the extension of .mps

      <Files *.mps>
          SetHandler mumps
      </Files>

      If you have any other extensions associated with Mumps source files you can also add their extensions in the same fashion.

      After you install the module and edit the config file, you will need to start or restart the Apache web server in order for the module to work properly.

      typically:

      /usr/local/apache2/bin/apachectl start

      or

      /usr/local/apache2/bin/apachectl restart

      Congratulations, your installation of a Mumps module is complete. Now you can access Mumps source files directly through the interpreter rather than using CGI on executables.

    3. Notes:

      You may place the *.mps file in any directory but the file must be world readable.

      Then there is the issue of the data base files. These need to be readable AND writable by the web server. Generally speaking, this means they need to be created prior to their first access by the web server.

      You can create them (Mumps.DB, __db.001, __db.002 and __db.003, sometimes a few more, or key.dat and data.dat) with a dummy Mumps program:

            zmain
            set ^a(1)=1
            halt

      (use some other variable name if ^a is needed in your app).

      You can now do one of two things:

      • make the data base files world rw
      • make the data base files owned by the webserver (usually user "apache").

      Either technique will work but the first is risky on an insecure system. The second will generally require system privileges.

      The file "mod-mumps.so" is a binary Mumps module that was compiled on a Mandrake 9.1 system. It may or may not work on your system, depending on libraries and other factors. Place it in your modules directory and make the noted changes to httpd.conf. If it does not work, build the module from scratch.

When running with a web server, it is critical that your Mumps program have sufficient file access permissions to be executed by the web server. This applies not only to executable programs but also to any associated directories and files. Further, all non-server global array files must also be read/write accessible to the web server. On some systems, you must place your web server executables into a specially named directory, such as cgi-bin. Execution of programs by the web server requires complete compliance with these instructions.


Uncontrolled Program Termination

If you halt a non-server compiled program with a control-C or other external Kill command, the native global array data base may be corrupted. For temporary files, this means that the temporary files may not be correctly deleted. For non-server permanent globals, the files may be corrupt. Back-up copies of critical data should be maintained. A dump function exists ("$zcd") that copies the global arrays to an ASCII text file which can be used to reload the data base. Generally speaking, programs that do not access the globals, or access them only to read their contents, do not corrupt the global arrays if forcibly terminated. Compiled Mumps programs attempt to intercept control-C (SIGINT) signals and terminate without error but this is not always possible.
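The general technique of intercepting SIGINT so that a program can stop at a safe point can be sketched in C++ as follows. This is an illustration of the idea only, not the code used by the Mumps runtime; the names g_interrupted, onSigint, installSigintHandler and runUntilInterrupted are hypothetical.

```cpp
#include <csignal>

// Set by the handler; checked by the work loop between units of work.
volatile std::sig_atomic_t g_interrupted = 0;

// The handler only records that the signal arrived.
extern "C" void onSigint(int) { g_interrupted = 1; }

// Install the handler. Returns true on success.
bool installSigintHandler() {
    return std::signal(SIGINT, onSigint) != SIG_ERR;
}

// Run up to maxSteps units of work, stopping early once SIGINT has arrived.
// 'step' stands in for one short, complete data base update.
// Returns the number of steps that ran to completion.
int runUntilInterrupted(int maxSteps, void (*step)(int)) {
    int done = 0;
    while (done < maxSteps && !g_interrupted) {
        step(done);
        ++done;
    }
    // A real program would flush and close its data files here before exiting.
    return done;
}
```

Because the handler does nothing but set a flag, an update that is in progress when control-C arrives is allowed to finish before the loop exits. A signal that cannot be caught, such as SIGKILL, bypasses this entirely, which is one reason interception is not always possible.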


Program Formats and Error Messages

Mumps programs may be created with any standard system editor which does not introduce embedded control codes into the program. Be careful here. Many word processors embed invisible control codes into your program.


Compiler Implementation Overview

This compiler does not implement the full 1995 Mumps (M) standard. It implements many of the features of the 1995 standard with both omissions and extensions. Work is on-going so the list of features will grow.

The compiled modules execute between 1.5 and 6 times faster than the same code executing on an interpreter. The lower multiplier reflects benchmark programs which are very global array intensive while the higher multiple is for programs that are less global array intensive.

The C++ code generated by the compiler has comments which are the original Mumps program's line and line number. The compiler also interjects its own comments in places to explain what it is doing. The C++ code should be fairly easy to follow. Naturally, you may alter the C++ code to add your own features.

This effort arose out of work to use Mumps as a server-side web scripting language. For this purpose, it is amazingly well suited (see the patient record system examples). To work in the web server environment, changes and extensions were made to an interpreter written by this author in the early 80's. One set of changes involved the ability to extract from the server environment data passed from browser forms. These data, in the form of name=value pairs, partly encoded, are automatically translated by the interpreter to Mumps variables with initialized values. Also, the need was recognized to be able to embed HTML code in Mumps programs and have the interpreter scan the HTML and substitute any expressions found in it with their results. Example:

    <title>&~^patient(ptid)~</title>

Here, the global array reference is replaced by its value. The &~ and ~ are delimiters. This works very well.
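The substitution can be pictured with a short C++ sketch. It is not the interpreter's implementation: expandTemplate is a hypothetical name, and the evaluate callback stands in for the Mumps expression evaluator, which the sketch does not attempt to reproduce.

```cpp
#include <functional>
#include <string>

// Replace each "&~expr~" in 'html' with the result of evaluate(expr).
// An opening "&~" that has no closing "~" is copied through unchanged.
std::string expandTemplate(
        const std::string &html,
        const std::function<std::string(const std::string &)> &evaluate) {
    std::string out;
    std::size_t pos = 0;
    while (pos < html.size()) {
        std::size_t open = html.find("&~", pos);
        if (open == std::string::npos) { out += html.substr(pos); break; }
        std::size_t close = html.find('~', open + 2);
        if (close == std::string::npos) { out += html.substr(pos); break; }
        out += html.substr(pos, open - pos);                       // text before the delimiter
        out += evaluate(html.substr(open + 2, close - open - 2));  // the expression's value
        pos = close + 1;                                           // continue after closing "~"
    }
    return out;
}
```

With an evaluate callback that returns a patient's name for the expression ^patient(ptid), the title line in the example above comes out with the name in place of the delimited expression.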

The key to this was the old Mumps interpreter which was very small (about 60,000 bytes under Linux) and quite efficient. It was a single user environment and had no major O/S issues inhibiting its interface. Since web transactions are very brief, serial locking of the data base works very well. A data base stand-alone daemon was written which took queries from a web server invoked database-less version of interpreter but it was actually slower and had many more integrity and security issues. The serial lock version works better. Tests show that a small, quick transaction oriented interpreter works very well.

In order to improve the performance even more, it was decided in November of 1998 to try compiling Mumps by modifying the Mumps interpreter written by this author to generate C++ code and then compiling the C++ code to binary.

One advantage of full compilation is interoperability with other languages and with the host operating system. This is achieved by compiling to C++, which has full access to all system features.

Be very careful of required syntax in Z commands. See the examples below and in checkout.mps. Error messages derive from both the Mumps compiler and the C++ compiler. Error messages from the C++ compiler, sometimes many, are often not helpful and can cause considerable frustration. You can, however, look in the C++ code to see what original line of Mumps may have generated the error.

The following lists some important changes that have been made. These mainly center on commands added to adjust to the C++ environment, which take the form of program and function structuring commands and data declarations.


Writing Programs, Functions and Calling Conventions

  1. Program Structure

    All programs must have a 'main' function. Because Mumps programs can be used with C++ programs, the 'main' function may be in the C++ routines.

    However, if there is no 'main' C++ routine, you must designate a Mumps routine as 'main'. This is done by means of the "zmain" command which tells the compiler that the code following it should be the 'main' function. The "zmain" command must be the only command on the line on which it appears. It may not have a label.

    If you attempt to compile and link Mumps code that has no "zmain" or C++ "main" function, you will receive error messages from the compiler and the link editor.

    Alternatively, you may compile one or more Mumps functions, none of which contain "zmain", if they are linked to either a Mumps or C++ module that contains a main function. See the section below on "Main Programs and Functions".

    The "zmain" command takes no arguments and should be the only command on the line. The line begins with one or more blanks. No label is permitted. The command generates a C++ 'main' program prologue. The prologue contains many definitions required by the generated C++ code. When a main program ends, the epilogue automatically closes the global arrays. On the other hand, global arrays are not automatically closed on conclusion of a sub-function.

  2. Main Programs and Functions

    When you compile your Mumps program, one or more C++ routines will be created. These, in turn, are compiled into binary executable code. The mumpsc compiler produces C++, not binary executables or assembly language.

    There are three ways to introduce functions into a Mumps program environment:

    1. As Mumps internal functions;
    2. As Mumps external functions; and
    3. As C++ functions.

    The simplest of these are the first: internal Mumps functions. These take the following forms:

    1. Inline functions with shared symbol table.

      This form of subroutine was the original form in Mumps. No parameters were permitted to be passed to the subroutine. The subroutine shares the same namespace as the calling program, hence the values of the variables i, j, and k are accessible to the subroutine and changes to them are reflected in the main program.

              zmain
              set i=10
              set j=20
              set k=30
              write "main program: ",i," ",j," ",k,!
              do test
              write "main program: ",i," ",j," ",k,!
              halt
      
      test
              write "sub-program: ",i," ",j," ",k,!
              set i=11
              set j=22
              set k=33
              quit
      
      which produces the following output:
      
      main program: 10 20 30
      sub-program: 10 20 30
      main program: 11 22 33
      

    2. Inline functions with separate symbol table and call by value.

      This form of subroutine call was introduced later. It permits parameters to be passed to the subroutine but the subroutine and calling program have different namespaces. That is, variables in the calling program are not visible to the called program and variables created in the called program are deallocated upon return and are thus not visible to the calling program. Changes to parameters in the called program do not change the corresponding arguments in the calling program.

              zmain
              set i=10
              set j=20
              set k=30
              write "main program: ",i," ",j," ",k,!
              do test(i,j,k)
              write "main program: ",i," ",j," ",k,!
              halt
      
      test(a,b,c)
              write "sub-program: ",a," ",b," ",c,!
              set a=11
              set b=22
              set c=33
              quit
      
      which produces the following output:
      
      main program: 10 20 30
      sub-program: 10 20 30
      main program: 10 20 30
      

    3. Inline functions with separate symbol table and call by reference.

      Same as the above but 'call by reference' is permitted. That is, changes to parameters made by the called program cause changes to the corresponding arguments in the calling program. Note the "." in front of the variables in the 'do' command that are to be passed by reference. Both call by reference and call by value arguments may be mixed in the same 'do' statement.

              zmain
              set i=10
              set j=20
              set k=30
              write "main program: ",i," ",j," ",k,!
              do test(.i,.j,.k)
              write "main program: ",i," ",j," ",k,!
              halt
      
      test(a,b,c)
              write "sub-program: ",a," ",b," ",c,!
              set a=11
              set b=22
              set c=33
              quit
      
      which produces the following output:
      
      main program: 10 20 30
      sub-program: 10 20 30
      main program: 11 22 33
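For readers more familiar with C++, the difference between the call by value and call by reference forms above corresponds roughly to the following sketch. It is only an analogy written for this comparison: the function names are hypothetical, and Mumps values are modeled here as strings.

```cpp
#include <string>

// Call by value: the parameters are copies, so the caller's variables are untouched.
std::string testByValue(std::string a, std::string b, std::string c) {
    a = "11"; b = "22"; c = "33";
    return a + " " + b + " " + c;
}

// Call by reference: assignments are visible to the caller after return.
std::string testByReference(std::string &a, std::string &b, std::string &c) {
    a = "11"; b = "22"; c = "33";
    return a + " " + b + " " + c;
}

// Mixed: 'a' is passed by value and 'b' by reference, similar to "do test(i,.j)".
std::string testMixed(std::string a, std::string &b) {
    a = "11"; b = "22";
    return a + " " + b;
}
```

One difference from C++ is that in Mumps the choice is made by the caller, with the "." prefix on each argument, rather than in the declaration of the called routine.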
      

    In each of the above examples, the subroutine and calling program are actually part of the same C++ function. In effect, subroutines of the type shown above are similar to the old Basic 'gosub' facility. Functions such as those shown above may also return values:

    Example recursive factorial computation:

            zmain
            set i=$$factorial(5)
            write "factorial=",i,!
            halt
    
    factorial(a)
            write "sub-program: ",a,!
            if a<2 quit 1
            quit a*$$factorial(a-1)
    
    which produces the following output:
    
    sub-program: 5
    sub-program: 4
    sub-program: 3
    sub-program: 2
    sub-program: 1
    factorial=120
    

    Note: while inline functions that do not take arguments may return values, because they share the same namespace with the calling program, an example such as the above is impractical.

    In addition to inline functions created by the mumpsc compiler, you may compile separate, non-inline functions written in either Mumps or C++. Non-inline functions are compiled to separate C++ functions and may be stored in object libraries and linked into executables independently.

    If you write your own C++ routines that are either called by a Mumps routine or call a Mumps routine, they must obey the Mumps calling conventions.

    A non-inline function is invoked differently than an inline function. For example, non-inline functions may be invoked as functions returning a value:

    set i=$$^MyFunction(a,b,c)
    

    or by means of the "do" command such as:

    do ^MyFunction(a,b,c)
    

    where "MyFunction", when written in Mumps, is usually something of the form:

    ^MyFunction(a,b,c) 
    	write a+b+c,! 
    	quit a+b+c
    

    Non-inline functions should normally be placed at the beginning of your source code file with any other functions prior to "zmain" and any other invocations. Example:

    ^MyFunction(a,b,c) 
          set i=a+b+c
          quit i
    
          zmain
          write $$^MyFunction(1,2,3),!
          halt
    

    (The above will write "6".) Note that non-inline functions begin with a "^". If you do not place non-inline functions prior to their first reference, you will need a C++ function header. For example, the above program could have been rewritten with the non-inline function appearing at the end if a header for the function appeared at the beginning:

    +char * MyFunction(struct MSV *, char *, const char *, const char *, const char *);
    
          zmain
          write $$^MyFunction(1,2,3),!
          halt
    
    ^MyFunction(a,b,c) 
          set i=a+b+c
          quit i
    
    

    Note the C++ function header. The first two arguments reflect internal structure and are always present while the remaining three arguments reflect the parameters actually passed to the function. The initial argument is the address of the runtime state vector and the second argument is a character string giving the label of the line in the subroutine where execution is to begin (rather than at the beginning). Normally, the second argument is the empty string ("") and subroutines begin execution on their first line. Generally speaking, it is easier to place functions prior to their first use, if possible.

    All functions beginning with the "^" character are generated as separate C++ functions and are called using C++ calling conventions. These functions may recursively call themselves and may be called by other C++ programs. Functions beginning with the "^" symbol push the runtime symbol table on entry and pop the table on exit. Thus, local variables created during subroutine execution are lost upon exit except for those passed as arguments with the "." operator.

    Additionally, functions may be in-line parts of your current program. These are not actual C functions but local groups of code that may be invoked by means of "do" or "$$"-type function invocations. These need not appear prior to their first use. An example of this is as follows:

          zmain
          set a=2
          do abc
          write "a=",a,!
          set a=2
          set x=$$abc
          write "a=",a," x=",x,!
          set a=2
          set b=9
          write $$xyx(a,b),!
          write "a=",a,!
          halt

    abc
          write "hello world a=",a,!
          set a=22
          quit "999"

    xyx(a,b)
          write "in sub x=",x,!
          set a=a*b
          quit a

    output:

    hello world a=2
    a=22
    hello world a=2
    a=22 x=999
    in sub x=999
    18
    a=2

    Note that when calling inline functions, the "^" is omitted. Also note the differences in symbol table use. For compatibility with earlier standards, when the invocation of "abc" is made, the symbol table is not pushed. Thus, the setting of variable "a" in the subroutine has the effect of creating and setting the variable in the calling program too. On the other hand, invoking a subroutine with arguments ("xyx") results in a symbol table push/pop on entry/exit and changes made to the variable "a" in the subroutine are not effective in the calling program.

    In general, the symbol table rules are:

    1. Variables from calling programs are visible to called programs except when the variable name is otherwise reused as in the case of formal parameters or the new command.
    2. If a subroutine is invoked with arguments, any changes made to a variable will be lost on exit. If a subroutine is not invoked with arguments, changes to variables will be visible to the calling program.
    3. Inline subroutines are not recursive. Thus the following will not work:

            zmain
            set x=$$aaa(5)
            write "x=",x,!
            halt

      aaa(a)
            write a,!
            if a=0 quit 0
            quit a+$$aaa(a-1)
      

      but the following does work (non-inline functions):

      ^aaa(a)
            write a,!
            if a=0 quit 0
            quit a+$$^aaa(a-1)
       
            zmain
            set x=$$^aaa(5)
            write "x=",x,!
      
      output:
      
            5
            4
            3
            2
            1
            0
            x=15
      

    Note that the "$$" prefix is used when the function is being invoked with anticipation of a return value. The "$$" is omitted when a function is referenced by a "goto" or "do" command. The "^" is present only when calling separate C++ functions.

    Functions are compiled in one of two ways:

    If the Mumps function name begins with a circumflex ("^"), the Mumps function becomes a C++ function separate from the C++ program "main" function. This is called a "separately compiled function." If the Mumps function name does not begin with a circumflex, the Mumps function becomes an inline section of code within the C++ function in which it appears. Only separately compiled functions may be called recursively.

    A new namespace is created on entry into a Mumps separately compiled function (called with a "^" prefix) and deleted upon exit. Variables passed to the function are copied to the new namespace but they are not copied back to the restored namespace on exit unless passed by the call-by-reference feature. The function may return a value to the calling program through the "quit" command.

    When calling a separately compiled function, variables created in the function are lost when the function exits. Variables known to the calling program are available to the called program unless the called program receives a variable in its argument list with the same name. For example:

    ^test(abc)
          set x=1
          write y,!
          write abc,!
          quit abc*2
          zmain
          set abc=888
          set y=999
          write $$^test(222),!
          write abc,!
          write x,!
          halt

    The above will print:

    999 (from the "write y,!" in the subroutine)
    222 (from the "write abc,!" in the subroutine)
    444 (from the "write $$^test(222),!" in the main program)
    888 (from the "write abc,!" in the main program)
    *** Variable not found in or near line 11

    The error at the end is from the "write x,!" in the main program - there is no value for "x" at this point - the value from the function was deleted on function exit.
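The visibility rules in this example can be modeled with a stack of name/value tables. The following C++ sketch is an illustrative model only, not the runtime's actual data structure; the class name SymbolTable and its member functions are assumptions made for the illustration. Lookups search from the newest frame toward the oldest, while assignments always go to the newest frame.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical model of a pushed/popped symbol table.
class SymbolTable {
public:
    SymbolTable() : frames_(1) {}                 // start with the main program's frame

    void push() { frames_.emplace_back(); }       // on entry to a "^" function
    void pop() {                                  // on exit; the function's variables vanish
        if (frames_.size() > 1) frames_.pop_back();
    }

    // Assignments always create or update the variable in the newest frame.
    void set(const std::string &name, const std::string &value) {
        frames_.back()[name] = value;
    }

    // Lookups search the newest frame first, then the callers' frames.
    // Returns false when the variable is not found in any frame.
    bool get(const std::string &name, std::string &value) const {
        for (auto it = frames_.rbegin(); it != frames_.rend(); ++it) {
            auto found = it->find(name);
            if (found != it->end()) { value = found->second; return true; }
        }
        return false;
    }

private:
    std::vector<std::map<std::string, std::string>> frames_;
};
```

Replaying the example against this model: after setting abc and y in the first frame, a push followed by setting the formal parameter abc to 222 and x to 1 makes y visible as 999 and abc as 222 inside the function. After the pop, abc is 888 again and x is no longer found.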

    If your function calls itself either directly or indirectly (that is, it is a recursive function), it must be compiled as a separately compiled function (^).

    Please review the notes below under the new command for details concerning symbol table manipulation in functions.

    If the result of a function is the argument to another function, it will be invoked regardless of the function called. Thus, the function "$$abc(1,2,3)" in the following is always invoked even though the "select" terminates after the first operand:

    set i=$select(1:1,0:$$abc(1,2,3))
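The same effect is familiar from C++, where function arguments are evaluated before the call regardless of which one the callee ends up using. The sketch below is an analogy only; selectEager, selectLazy, abc and g_calls are hypothetical names, and the lazy variant is shown merely to contrast the two behaviors.

```cpp
#include <functional>
#include <string>

int g_calls = 0;  // counts how many times abc() has actually run

std::string abc() { ++g_calls; return "from abc"; }

// Eager form: both candidate values are computed by the caller before
// selectEager starts, even if only the first one is returned.
std::string selectEager(bool c1, const std::string &v1,
                        bool c2, const std::string &v2) {
    if (c1) return v1;
    if (c2) return v2;
    return "";
}

// Lazy form: the candidates are callables, and only the chosen one is invoked.
std::string selectLazy(bool c1, const std::function<std::string()> &v1,
                       bool c2, const std::function<std::string()> &v2) {
    if (c1) return v1();
    if (c2) return v2();
    return "";
}
```

In the eager form, a call such as selectEager(true, "1", false, abc()) runs abc() even though its result is discarded. If a function used as a "$select" operand has side effects, it is therefore safer to call it in a separate statement guarded by an "if".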

    Separately compiled functions must appear before they are first referenced or there must be a C++ function header appearing prior to the first use of the function. A C++ function prototype for a function is of the form:

    + char* fcnName(struct MSV * svPtr, char* ep, const char* arg1 ...);

    Alternatively, the built-in define symbol "StateVector" may be used:

    + char* fcnName(StateVector svPtr, char* ep, const char* arg1 ...);

    "StateVector" is defined in "mumpsc/stateVector.h" as "struct MSV *" and may be used to simplify the code.

    Place as many constructs of the form "const char* arg1" as there are formal parameters. The "ep" argument must be present and is separate from the formal parameters. The pointer "svPtr" points to the runtime state vector which contains many parameters concerning the running program. Normally, when calling a Mumps function from a C++ program, the "ep" parameter will be the empty string.

    If you call a Mumps function from a C++ program, you must provide a state vector address. You can do this by calling the built-in function "AllocSV()" as follows:

    #include <mumpsc/stateVector.h>
    StateVector svPtr=AllocSV();
    char *mpsFcnCall(StateVector,char *,const char *,const char *,const char *);
    . . .
    mpsFcnCall(svPtr,"",arg1,arg2,arg3);
    free (svPtr);
    . . .

    Thus, if you have a Mumps function whose entry point is:

    Fcn(a,b,c)

    its prototype will be:

    + char * Fcn(StateVector svPtr, char * ep, const char *, const char *, const char *);

    Inline functions must contain a quit command. A source code end of file is interpreted as a halt command.

    Examples:

    ^subFunction(a,b,c)
          write "subFunction() main entry ",a," ",b," ",c,!
          quit a+b+c
    ep
          write "subFunction() ep entry ",a," ",b," ",c,!
          quit 123

    ^test3()
          write "test3() entered",!
          quit "test3 returns"

          zmain
          write $$^subFunction(1,2,3),!!
          write $$ep^subFunction(1,2,3),!!
          do ^subFunction(9,8,7)
          do ep^subFunction(10,20,30)
          do ^test3
          write $$^test3,!!
          write $$test1,!!
          do test1
          write $$test2(1,2,3),!
          do test2(90,80,70)
          halt

    test1
          write "test1 entered",!
          quit "test1 returns"

    test2(a,b,c)
          write "test2 entered ",a," ",b," ",c,!
          quit a+b+c

    In the above examples, subFunction() and test3() are compiled as separate C++ functions. These may be called by either Mumps programs or C++ programs. Alternatively, test1 and test2() are compiled as part of the Mumps main routine and these may only be called by Mumps code that is also part of the main routine. test1 and test2() may not be called by C++ programs or Mumps programs compiled as separate C++ functions. Inline functions may not call themselves. Use separately compiled functions for recursive calls.

    Please note the Mumps calling structure. The separately compiled functions may be entered either by the do command, a goto command or as function references. A separately compiled function is a function linked into the executable module as a subroutine. Such a subroutine may be compiled either at the same time or separately from the main ("zmain") routine but it is linked into the final resulting module. All these routines begin with the "^" character. If entered by the "do" command, separately compiled function names are prefixed by the "^" character. If entered as function references, their names are always prefixed by "$$^".

    Internal or inline functions ("test1" and "test2") may be entered by the "do" or "goto" commands or by means of function references if and only if the entry point has parameters. If invoked by "do" or "goto", the name is the same. If entered as a result of a function reference, the name is preceded by 2 dollar signs ("$$").

    You may call separately compiled external functions from C. The function prototype for the Mumps function will be of the form:

    char * FunctionName(struct MSV * svPtr, char * ep, const char * arg1, const char * arg2, ... );

    The parameter "ep" is either the empty string ("") or a string containing the name of a label. If it is not empty, execution begins at that label. If there are arguments, they are passed as character strings. The function returns a pointer to a character string. The pointer "svPtr" gives the address of the system state vector and is used to access system parameters.

    You may create a separate function in C, either in the same file as the Mumps program or in a separate file, by following the format of the prototype given above. It must return a pointer to a string, and it will receive at least two arguments: "svPtr", a pointer to the system state vector, and "ep", a pointer to a string. Following these, it will receive the actual arguments as successive pointers to strings. Be certain that the pointer you return does not point at an automatic variable that will expire upon function exit.
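As a sketch of a function written to this convention, the following C++ code returns its result in a static buffer, so the returned pointer remains valid after the function exits, and uses "ep" to choose between two behaviors. The function SumOrProduct and the label name "product" are hypothetical, and struct MSV is only forward-declared here as an opaque stand-in; in a real program its definition comes from the Mumps headers.

```cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>

struct MSV;  // opaque stand-in for the runtime state vector (assumption for this sketch)

// Hypothetical function following the documented argument order:
// state vector, entry point label, then the actual arguments as strings.
char *SumOrProduct(struct MSV *svPtr, char *ep, const char *a, const char *b) {
    static char result[64];  // static storage: still valid after the function returns
    (void)svPtr;             // the state vector is not needed in this example

    long x = std::atol(a);
    long y = std::atol(b);

    // An empty "ep" means "start at the top"; a label selects another entry point.
    long value = (ep != nullptr && std::strcmp(ep, "product") == 0) ? x * y : x + y;

    std::snprintf(result, sizeof result, "%ld", value);
    return result;
}
```

Returning the address of a local array declared without "static" would hand the caller a pointer to storage that no longer exists. The trade-off of the static buffer is that each call overwrites the previous result, so a caller that needs to keep the value must copy it before calling again.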

    For example, the following is a C++ function embedded in a Mumps program. It appears at the beginning, before the "zmain" command (all inline C functions should appear prior to any Mumps functions):

    + #include <math.h>
    + #include <stdlib.h>
    + char* Bounds (struct MSV * svPtr, char* ep, const char* p, const char* MinFreq,
    +               const char* MaxFreq,
    +               char* pp, char* MinDocs) {
    +       long a,b,c,d,e;
    +       a=atol(p);
    +       b=atol(MinFreq);
    +       c=atol(MaxFreq);
    +       if (a<b || a> c) return "1";
    +       d=atol(pp);
    +       e=atol(MinDocs);
    +       if (d<e) return "1";
    +       return "0";
    +       }

    It is invoked by the following from the Mumps program:

    if ($$^Bounds(%,MinFreq,MaxFreq,%%,MinDocs)) do ...

    Because Mumps is slow when performing arithmetic, it may be desirable to write short C functions to do the arithmetic as the example shows. (The example is taken from the program "reader.mps" contained in "doc/examples/ISR" in the distribution.)

    If your program contains no Mumps "zmain", but does contain Mumps sub-functions followed by C++ code, you must have a "zexit" command to indicate to the Mumps compiler the end of your function or functions. If you elect to disregard this requirement, errors will result. For example:

    # a C++ subroutine called by Mumps

    + char* Cfcn(struct MSV * svPtr, char * ep, const char * gbl) {
    +       printf("%s ",gbl);
    +       return "";
    +       }

    # a Mumps subroutine

    ^subFunction(a,b,c)
          write "Main Sub function entry point Entered",!
          xecute "for i=1:1:10 set ^a(i)=i"
          write "a=",a," b=",b," c=",c,!
          quit a+b+c
    ep
          write "ep Sub function entry point entered",!
          for i=1:1:10 set %=$$^Cfcn(^a(i))
          write !
          quit a*b*c

          zexit

    # the C++ main program

    + int main() {
    +       char *p1,*p2;
    +       struct MSV *svPtr=AllocSV();
    +       printf("hello world\n");
    +       p1=subFunction(svPtr,"","2","3","4");
    +       printf("First call returned: %s\n",p1);
    +       p2=subFunction(svPtr,"ep","2","3","4");
    +       printf("Secnd call returned: %s\n",p2);
    +       return 0;
    +       }

    This example contains a C++ main program and sub-function ("Cfcn") and a Mumps sub-function ("subFunction"). The "zexit" command is required after the last Mumps sub-function in order to indicate to the compiler that it has finished the Mumps routines and needs to install the epilogue code. If the "zexit" is missing, the compiler will assume the C++ main routine is part of the preceding Mumps sub-function.

    The "zexit" is not needed if all the C++ functions appear prior to the first line of Mumps code.

    The following is an exhaustive list of examples:

    +char * ddd (struct MSV *svPtr, const char *ep, const char * i) {
    +     /* c functions should appear first */
    +     printf("ddd entered %s\n",i);
    +     return "";
    +     }

    ^aaa()
          write "^aaa entered",!
          quit

    ^bbb(i)
          write "^bbb entered ",i,!
          quit

    ^ccc(i)
          write "^ccc entered ",i,!
          quit

    ^eee()
          quit "hello world"

    ^fff()
          quit "hello other worlds"

          zmain
          write "main entered",!
          do ^aaa
          do aaa
          do bbb(456)
          do ^bbb(777)
          do ccc
          do ^ccc(123)
          do ^ccc(321)
          do ^ddd(333)
          do ^ddd(444)
          write $$^eee,!
          write $$^fff,!
          goto ^aaa
          write "main exits",!
          halt

    aaa()
          write "aaa entered",!
          quit

    bbb(i)
          write "bbb entered ",i,!
          quit

    ccc
          write "ccc entered",!
          quit

    The output of the above is:

    main entered
    ^aaa entered
    aaa entered
    bbb entered 456
    ^bbb entered 777
    ccc entered
    ^ccc entered 123
    ^ccc entered 321
    ddd entered 333
    ddd entered 444
    hello world
    hello other worlds
    ^aaa entered

  3. Linking from C++ Programs

    C++ programs can call upon Mumps methods. See the following example:

    hct.mps
    ^MaxHct(ptid)
          if '$data(^pat(ptid)) quit "0"
          set i=-1
          set max=0
          for  do
          . set i=$next(^ptHct(ptid,i))
          . if i<0 break
          . if i>max set max=i
          quit max

    ^MinHct(ptid)
          if '$data(^pat(ptid)) quit "0"
          set i=$next(^ptHct(ptid,-1))
          if i<0 quit 0
          set min=i
          for  do
          . set i=$next(^ptHct(ptid,i))
          . if i<0 break
          . if i<min set min=i
          quit min

    ^AvgHct(ptid)
          if '$data(^pat(ptid)) quit "0"
          set i=$next(^ptHct(ptid,-1))
          if i<0 quit 0
          set avg=i
          for j=2:1 do
          . set i=$next(^ptHct(ptid,i))
          . if i<0 break
          . set avg=avg+i
          quit avg/(j-1)

    ^CountHct(ptid)
          if '$data(^pat(ptid)) quit "0"
          set i=-1
          for j=1:1 do
          . set i=$next(^ptHct(ptid,i))
          . if i<0 break
          quit j-1

    ^HctValue(ptid,v)
          if '$data(^pat(ptid)) quit "-1"
          set v=$next(^ptHct(ptid,v))
          quit v

    ^findPatient(pt)
          if $data(^pat(pt)) quit "1"
          else  quit "0"
          zexit

    patient.cpp
    #include <cstdlib>
    #include <cstring>
    #include <iostream>
    #include <mumpsc/libmpscpp.h>

    using namespace std;

    // Mumps routines compiled from hct.mps
    char * MaxHct(struct MSV *, char*, const char*);
    char * MinHct(struct MSV *, char*, const char*);
    char * AvgHct(struct MSV *, char*, const char*);
    char * CountHct(struct MSV *, char*, const char*);
    char * HctValue(struct MSV *, char*, const char*, const char*);
    char * findPatient(struct MSV *, char*, const char*);

    class PatientHct {
    public:
          PatientHct();
          double Max();
          double Min();
          double Avg();
          bool Exists(char*);
          long Count();
          double Value();
    private:
          char ptid[32];
          char current[32];
          char hctVal[32];
          struct MSV * svPtr;
          };

    PatientHct::PatientHct() {
          strcpy(ptid,"");
          strcpy(current,"-1");
          svPtr=AllocSV();
          }

    double PatientHct::Max() { return atof(MaxHct(svPtr,"",ptid)); }

    double PatientHct::Min() { return atof(MinHct(svPtr,"",ptid)); }

    double PatientHct::Avg() { return atof(AvgHct(svPtr,"",ptid)); }

    long PatientHct::Count() { return atol(CountHct(svPtr,"",ptid)); }

    // returns the next Hct value after the current one
    double PatientHct::Value() {
          strcpy(current,HctValue(svPtr,"",ptid,current));
          return atof(current);
          }

    bool PatientHct::Exists(char *ptnbr) {
          // compare string contents, not pointer values
          if (strcmp(findPatient(svPtr,"",ptnbr),"1")==0) {
                strcpy(ptid,ptnbr);
                return true;
                }
          strcpy(ptid,"");
          return false;
          }

    int main() {
          double val;
          char ptnbr[32];
          PatientHct pHct;

          while (1) {
                cout << "enter ptid ";
                cin.getline(ptnbr,32);
                if (!pHct.Exists(ptnbr)) {
                      cout << ptnbr << " not found. Exiting." << endl;
                      break;
                      }
                if ( pHct.Count() > 0 )
                      while ( (val = pHct.Value()) > 0) cout << val << endl;
                else cout << "Sorry, no Hct's for " << ptnbr << endl;
                }
          return 0;
          }

    To compile and link the above, use commands of the form:

    mumps2c hct.mps
    g++ -c hct.cpp
    g++ patient.cpp hct.o -lmpscpp -lmumps -lmpsglobal_native -lpcre

  4. Creating and Linking Multiple Mumps Object Files Together

    Mumps functions that are separate C++ functions (i.e., their names begin with a circumflex) may be separately compiled and linked together from object code modules. For example, if you have a main Mumps function in a file named main.mps that calls upon a function contained in the file test.mps, you may first translate the called function to C++, compile it to object code, and then link it with the main program as follows:

    mumps2c test.mps
    g++ -c test.cpp
    mumpsc main.mps test.o

    This assumes that the main function is in main.mps. Likewise, test.cpp contains the subroutine, also created by the Mumps compiler. Only one zmain command is permitted in any final binary executable.

    Be careful not to use the underscore character in function names. In Mumps this means concatenate. Be careful not to use the names of functions already known to the C++ environment, including those names found in "bifs.c" in the distribution.

  5. Applications with Many Program Modules

    Some applications consist of literally thousands of individual Mumps code modules. These invoke one another by constructs such as "do ^xxx" or "do ^@xxx". In interpreter-based Mumps systems, these constructs do not create problems, as the modules are executed directly from source code or from compressed byte code representations of the modules.

    In a compiled environment, however, these constructs are more problematic. In particular, the modules must be pre-compiled and they must be available to be dynamically loaded at run time.

    If all modules in a large system, such as the VA Vista, were compiled into one large executable, it would be too large to load on even the largest machines. Consequently, we have developed a dynamically linked shared object facility.

  Parameters to Subroutines

    From Version 6, the compiler supports both call by value and call by reference forms of parameter passing for compiled code. This option is not presently supported in the interpreter.

    Arguments are passed by reference if their names are preceded in the calling routine by a decimal point. Otherwise, they are call by value.

    Call by reference variables that are altered in a called routine are also altered in the calling routine, whereas call by value variables are not affected by changes made to them in subroutines. The decimal point precedes the names of variables to be passed by reference only in the calling routine, not in the called routine's parameter list.

    Example:

    ^tst(a,b)
          set a=100
          set b=200
          quit

          zmain
          set a=0
          set b=0
          do ^tst(.a,b)
          write a," ",b,!
          halt

    The above will write "100 0".

    If a calling parameter other than the first or last is omitted, a zero will be inserted in its place. For example:

    ^abc(a,b,c,d)
          write a,b,c,d,!
          quit

          zmain
          do ^abc(1,,,4)

    will be understood as though you had written:

          do ^abc(1,0,0,4)

    You may not omit the first or final parameter.

  6. C++ style language comments

    You may place a comment on a line of Mumps (in compiled code only - this feature is not available in the interpreter) by beginning the comment with "//". One or more blanks should precede the comment except following argumentless commands in which case 2 or more blanks must precede the "//". The "//" may also be used at the beginning of a line starting in column 1.

  7. The "do" and "goto" commands

    The "do" and "goto" commands in the compiler may reference function names, and they may pass parameters if the functions are compiled into the executable module or are local to the program. See the examples in checkout.mps. This means that constructions of the form "do ^abc.mps" require that "abc.mps" be precompiled, either as part of the current source code module, or linked into the final executable by means of object modules included with the link edit, or by means of shared object libraries. The same restriction applies to the "goto" command.

    In the interpreter, however, the construction "do ^abc.mps" dynamically invokes the source code program "abc.mps" and interprets it directly. This mode of operation, however, is much less efficient as interpretation is about 5 times slower than compilation.

  8. Compiling Programs

    The command to compile, link and execute the demo program (under Linux) is:

    mumpsc checkout.mps

    where mumpsc is the name of the Linux shell script provided with the distribution. The script compiles your Mumps program (checkout.mps) to C++ and then compiles the C++ program to an executable that will be named checkout.cgi.

    Some C++ compilers check for error/warning conditions that others do not. The code generated by this compiler compiles and executes correctly on the MS Visual C++ 6.0 and Linux C++ compilers. If you have problems, please send me mail giving the type of compiler and the error. Some DOS compilers do not provide all required library functions. If this is so, you will need to write the missing functions. See any standard Linux system for any needed definitions.

    The Mumps Compiler will generate error messages which are self explanatory. In rare cases, some errors will be detected by the C++ compiler or the loader. Loader messages usually refer to subroutines you have named but not included. C++ compiler messages will contain a line number. See the C++ module for your program at the line number given. Look on the preceding lines and you will see, perhaps several lines above, a line showing the original Mumps code in your program that caused the error.


Hybrid Mumps and C++ Programs

With version 8, the Mumps Compiler produces C++ translations of Mumps source programs, and these translations, in conjunction with the MDH (Multi-Dimensional and Hierarchical) Library (also distributed with the Mumps Compiler), may be integrated into larger C++ applications. This, for example, permits C++ programs and class libraries to directly access Mumps global arrays and to manipulate and navigate the arrays in the same manner as in Mumps. It also permits C++ programs and classes to directly invoke Mumps routines, including routines which use indirection. For the first time, Mumps and C++ programs can share the same variables.

For example, consider the following hybrid C++/Mumps program:

^Avg(test)
      set total=0,count=0
      set id=""
      for  do
      . set id=$order(^Labs(id))
      . if id="" break
      . set name=""
      . for  do
      .. set name=$order(^Labs(id,name))
      .. if name="" break
      .. set date=""
      .. for  do
      ... set date=$order(^Labs(id,name,date))
      ... if date="" break
      ... set total=total+^Labs(id,name,date)
      ... set count=count+1
      set avg=-1
      if count'=0 set avg=total/count
      quit avg
      zexit

+ int main() {
+       mstring name;
+       mstring id;
+       mstring date;
+       mstring test;
+       mstring result;
+       mstring avg;
+       global Labs("Labs");
+
+       while (1) {
+           cin >> id;
+           if (cin.eof()) break;
+           cin >> name >> date >> date >> test >> result;
+           Labs(id,name,test,date)=result;
+           }
+       cin >> test;
+       avg = Avg(svPtr, "", test);
+       cout << "average for test " << test << " is " << avg << endl;
+       return 0;
+       }

Here "mstring" is an MDH data type that mimics the typeless variables in Mumps. Instances of class "global" are global arrays which can be accessed by either Mumps or C++ routines. For example, the above could have been expressed entirely in C++ as follows:

#include <mumpsc/libmpscpp.h>

global Labs("Labs");

double Avg(mstring test) {
      double total=0.0,avg=-1.0;
      int count=0;
      mstring id,name,date;

      id="";
      while (1) {
          id=$order(Labs(id),1);
          if (id == "") break;
          name="";
          while (1) {
              name=$order(Labs(id,name),1);
              if (name == "") break;
              date="";
              while (1) {
                  date=$order(Labs(id,name,date),1);
                  if (date == "") break;
                  total=total+Labs(id,name,date);
                  count=count+1;
                  }
              }
          }
      if (count != 0) avg=total/count;
      return avg;
      }

int main() {
      mstring name;
      mstring id;
      mstring date;
      mstring test;
      mstring result;
      mstring avg;

      while (1) {
          cin >> id;
          if (cin.eof()) break;
          cin >> name >> date >> date >> test >> result;
          Labs(id,name,test,date)=result;
          }
      cin >> test;
      avg = Avg(test);
      cout << "average for test " << test << " is " << avg << endl;
      return 0;
      }

Additionally, it is possible to share variables between Mumps and C++ programs. Assume the data base from the above example and consider the following:

^FindNext(test,max)
      for  do
      . set id=$order(^Labs(id))
      . if id="" break
      . set name=""
      . for  do
      .. set name=$order(^Labs(id,name))
      .. if name="" break
      .. set date=""
      .. for  do
      ... set date=$order(^Labs(id,name,date))
      ... if date="" break
      ... if ^Labs(id,name,date)>max goto exit
      quit 0
exit  export name,id
      quit 1
      zexit

+ int main() {
+       mstring name("name");
+       mstring id("id");
+       mstring date;
+       mstring test;
+       mstring result;
+       mstring avg;
+       mstring max;
+       global Labs("Labs");
+
+       cin >> test >> max;
+       id="";
+       while (1) {
+           result=FindNext(svPtr, "", test, max);
+           if (result == 1) cout << id << " " << name << endl;
+           }
+       return 0;
+       }

Note that "mstring" variables "name" and "id" in the C++ program are declared with string constants. These are the Mumps runtime symbol table names for these variables. When the Mumps function is run, these variables are available to the Mumps program. The "export" command is similar to the "export" shell command in Linux: it causes the named variables to be exported from the current level of the symbol table to the outer layers. Otherwise, all local variables would be lost on exit. On entry to the Mumps routine, any variable ("id" in this case) not found in the current symbol table layer is sought in outer layers.

The Mumps Compiler now generates C++ code and thus supports the C++ based MDH Toolkit. The MDH Toolkit not only includes classes and methods that provide a Mumps personality to C++ programming, but also a large number of specialized functions to manipulate matrices, genomic and text data.

Implementation Notes

  1. Line Structure

    A line with a pound sign in column one is taken as a comment.

    A line beginning with a percent sign (%) or a plus sign (+) in column one is taken as in-line C++ code.

    No label may have the same name as a variable. Use of labels and variable names that are the same can cause unpredictable behaviour.

    After the last Mumps argument or command on a line, two adjacent forward slash characters (//) are permitted. The remainder of the line will be interpreted as a comment. Note: if the last Mumps command on the line takes no arguments (for example, the argumentless DO), there MUST be at least two spaces between the argumentless command and the // (if present). Otherwise, one or more blanks must precede the //.

    Line execution level codes may be inserted in a line as the first item following the blank character. The line level codes are from one to ten decimal points followed by exactly one blank. Lines with line levels greater than the current line execution level are skipped. Only ten levels are permitted.

    Commands may be abbreviated or fully spelled out. As this is a compiler, not an interpreter, there is no penalty for writing readable code. If more than one character is used in a command name, it must be fully spelled out.

  2. The default IO unit number is 5. Unit 5 refers to the console terminal (usually a CRT) for both input and output. When running as a sub-process of a web server, unit 5 output is sent to the web server. Unit 5 may not be opened or closed. An attempt to do so will result in an error message.

    Units 1 through 10 may be used by the user for sequential file structures. These units require a file name after the unit number expression in the Open statement. The unit number expression must be followed by a colon and the file name must follow the colon. The file name may be represented as a variable or literal constant. Literal constant file names are enclosed in quotation marks ("). The file name must be followed, either within the quotes if given as a literal or as part of the value if given as a variable, by either ,OLD or ,new. This parameter indicates whether the file is to be read (,OLD; input implied) or created (,new; output implied). For example:

    open 1:"MYFILE.DAT,new"
    open 2:"TABLE.DAT,OLD"

    In the first of these, the file will be created and only output (write) operations are permitted. In the second, the file is presumed to exist and only input operations will be permitted. If the new option is given, any previous generation of the file is deleted.

  3. During read operations to units one through ten, the end of file condition is signaled by the return of a null string and the setting of $test to zero when the input is exhausted. A read operation which is successful sets the $test variable to one.

  4. The ASCII NUL (0) character may not be used. Usage will result in a parser error. No ASCII character of value greater than 127 may be generated or used.

  5. No string, program line or intermediate result may exceed 512 characters in length. An attempt to create a string of length greater than 512 may result in truncation without an error message. The user should also note that when functions are evaluated, the length of the function is the length of its name, parentheses, commas and arguments after evaluation. Global array references (keys) can be up to 255 characters long.

  6. The name of the variable used in a "setpiece" function must be 63 or fewer characters in length.

  7. Numbers may not contain more than 20 characters including leading zeros, plus and minus signs, and decimal points. Numeric quantities are stored internally as double variables (double precision floating point). Some round off may occur in certain operations in the usual manner.

  8. When global arrays are built, numeric indices are stored in ASCII character order.

  9. You must conclude your last line of output with a newline command (!) or the contents will be lost when the program closes.

  10. The file name "Mumps.Locks" is reserved for system use (see discussion under the LOCK command).

  11. Unsupported and Non-Standard Features

    1. The "goto" argument must be a local label with no offset or the name of a function. If the label is not found, the error message will be from the C++ compiler or linker, not the Mumps compiler.

    2. Functions: "$fn", "$st" are not supported. Full numeric formatting facilities are available via the C++ "sprintf" function.

    3. Special variables: "$device", "$ec", "$es", "$et", "$key", "$quit", "$stack", "$system", "$tl", "$tr" are not supported.

    4. The "TCommit", "TREstart", "TROllback", and "TStart" commands are not supported.

    5. Functions may be invoked in the for command as in:
      for i=$fcn(i)
      for i=1:$fcn(i):10
      for $fcn(i)
      

  12. Program size limit

    Compiled Mumps source programs may be of any length. Interpreted programs may not exceed the maximum size of the "ibuf" cache set by "configure". This number may not exceed 32,367 and it would not be prudent to lower it below 4,096.

  13. Substitute global array processor

    You may substitute your own global array processor. The routine Mglobal(...) is the global array handler. Details below. At present, if you want to run with no global arrays, substitute a dummy routine for Mglobal(...).

  14. Inline C++ code

    If a line begins with a percent sign (%) or plus sign (+) in column 1, the remainder of the line is treated as an inline C++ statement. If a parameter to a Mumps subroutine begins with a plus sign (+), it is interpreted as a C++ data item. C++ statements can access Mumps variables by means of the sym_() function. It is found in sym.c and is documented both there and above under functions. The following is an example of mixing C++ and Mumps:

    ^test(a,+long i,+long j)
          set i=0
    +     for (i=0; i<j; i++) {
          set i=i+1
          write a,i,!
    +     }

    The Mumps function "test()" receives one Mumps argument ("a") and two C++ arguments ("i" and "j"). The C++ arguments are used in the inline C++ code. Note that the C++ variable i is not the same as the Mumps variable i.

    If you write inline C++ functions, place them at the beginning of the source code prior to any Mumps code.

    The Mumps symbol table can be accessed from inline C++ code by calls to the symbol table routine sym_(int, unsigned char*, unsigned char*). The first argument is the operation: 0 for retrieve and 1 to store or create. The second argument contains the name of the variable and the third is the returned value or the value to be stored, depending on the operation. A NULL pointer is returned on error.

    Inline C++ code placed immediately after a dotted indent is considered to be part of the dotted indent block.

  15. "$zd" date functions

    Some C++ compilers produce erroneous results for some of the Z-type date functions. Please verify on your configuration.

  16. Math package

    Arithmetic operations are carried out by routines in arith.c. These are: "add()", "sub()", "mult()", "divx()", "numcomp()", and "expx()". These functions perform arithmetic in double precision floating point or "long" as appropriate and convert the result back to string. Thus, integers may range between approximately +/- 2 billion. Floating point arithmetic is carried to approximately 15 digits of precision. You may change the data types to longer types, such as "long double" or "long long" if your application requires the additional digits of accuracy. Be certain to change the parameters to the converting routine ("gcvt()") as well. Using longer arithmetic data types will slow execution. You may substitute your own routines here if you want another style of calculation (for example, BCD). Values are passed to these functions as character strings and the callers expect a character string in return.

  17. The file name "Mumps.Locks" is reserved for system use. See discussion under the "lock" command below.

  18. Operators

    The modulo operator follows the C++ language conventions and the Mumps expression x#y is implemented as the C++ code x%y. Mumps and C++ calculate modulo differently for reasons unknown to me. This affects the results when one of the operands is negative and the other is positive.

  19. Indirection and Interpretation

    If you use indirection ("@" operator) or the "xecute" command, the run-time interpreter will be incorporated into your executable. It will add about 50k to the final size of your program.

    Both forms of indirection, the "@" operator and the "xecute" command, require the run-time interpreter. Indirection should be avoided, if possible, as interpretation is considerably slower than direct execution.

    Indirection is implemented by run-time interpreter routines that directly execute Mumps code. The run-time interpreter is invoked by the "@" operator and by the Xecute command. If a command contains an instance of the "@" operator, the entire command will be interpreted by the run-time interpreter.

    The run-time interpreter permits some language constructs not supported by the compiler and, alternatively, there are some compiler supported constructs not supported by the run-time interpreter. Please test your code thoroughly. Because the run-time interpreter does not examine a line of code or expression until actually called upon to interpret it, it is possible for erroneous code to remain undetected until run-time.

    The builtin interpreter that handles indirection does not presently support arguments to functions or expressions of the form "@xxx@(abc)".

    The run-time interpreter uses the same symbol table and global arrays as the compiler. Thus, data may be exchanged between compiled and interpreted routines.

    Errors detected by the interpreter will generally halt the program you are running. You may, however, trap errors from the interpreter. This is discussed in Error Messages.


Relational Algebra for Global Arrays

Global arrays can be manipulated as relations using a set of provided relational algebra programs. The functions to do these are in the file libmpsrdbms.mps in mumpsc/Mumps in the standard distribution. In the following examples, the names of global arrays to be manipulated are enclosed in double quotes without the leading circumflex(^) character:

  1. Union("global1","global2","global3")

    The rows of global arrays global1 and global2 are merged into the global array global3 with duplicate rows collapsed. Both input arrays must be of the same number of columns and corresponding columns should be from the same domain of values. The resulting array will be of the same number of columns. For example:

    	set ^a("apple","$0.95")=""
    	set ^a("pear","$1.25")=""
    	set ^a("orange","$1.50")=""
    
    	set ^b("apple","$0.95")=""
    	set ^b("banana","$0.80")=""
    	set ^b("peach","$1.00")=""
    
    	do ^Union("a","b","c")
    
    	yields:
    
    	^c("apple","$0.95")
    	^c("banana","$0.80")
    	^c("orange","$1.50")
    	^c("peach","$1.00")
    	^c("pear","$1.25")
    

  2. Project("in","out","cols")

    Columns specified by the string cols from the input array in are copied to the out array. The string cols is one or more digits in the range 1 through 9 with no embedded blanks or other delimiters. Only the first 9 columns of a relation may be projected at this time. The list of digits specifies, in order, the columns in the output array to be taken from the source array. A given column number may be present more than once, in which case that column will be reproduced more than once. Example:

    	set ^a("apple","$0.95")=""
    	set ^a("pear","$1.25")=""
    	set ^a("orange","$1.50")=""
    
    	do ^Project("a","b","1")
    
    	yields:
    
    	^b("apple")
    	^b("pear")
    	^b("orange")
    
          do ^Project("a","c","21")
    
          yields:
    
          ^c("$0.95","apple")
          ^c("$1.25","pear")
          ^c("$1.50","orange")
    

  3. Subtract("in","sub","out")

    Copies global array in to global array out except those rows that are also in sub. That is, out consists of rows from in minus those rows of sub. Example:

    	set ^a("apple","$0.95")=""
    	set ^a("pear","$1.25")=""
    	set ^a("orange","$1.50")=""
    
    	set ^b("apple","$0.95")=""
    	set ^b("banana","$0.80")=""
    	set ^b("peach","$1.00")=""
    
    	do ^Subtract("a","b","c")
    
    	yields:
    
    	^c("pear","$1.25")
    	^c("orange","$1.50")
    

  4. Intersect("a1","a2","out")

    The array out will consist of those rows common to a1 and a2. Example:

    	set ^a("apple","$0.95")=""
    	set ^a("pear","$1.25")=""
    	set ^a("orange","$1.50")=""
    
    	set ^b("apple","$0.95")=""
    	set ^b("banana","$0.80")=""
    	set ^b("peach","$1.00")=""
    
    	do ^Intersect("a","b","c")
    
    	yields:
    
    	^c("apple","$0.95")
    

  5. Select("a","b","exp")

    Rows from a are copied to b if the expression in exp is true. The expression must be a valid Mumps expression with care taken to fully parenthesize complex sub-expressions. The expression is not checked until runtime. Column elements of the a array may be addressed as a(1), a(2), .... Example:

    	set ^a("apple","$0.95")=""
    	set ^a("orange","$1.50")=""
    	set ^a("pear","$1.25")=""
    
    	do ^Select("a","b","a(2)>1.00")
    
    	yields:
    
    	^b("orange","$1.50")
    	^b("pear","$1.25")
    

  6. Join("a1","a2","out","exp")

    The output array out is constructed by concatenating rows from a1 with rows from a2 if the expression in exp is true. Column elements of the first array are addressed as a(1),a(2),... and column elements of the second array are addressed as b(1),b(2),.... The expression must be a valid Mumps expression. Joins can be costly in terms of computer time. Example:

    	set ^a("apple","$0.95")=""
    	set ^a("pear","$1.25")=""
    	set ^a("orange","$1.50")=""
    
    	set ^b("apple","$0.95")=""
    	set ^b("banana","$0.80")=""
    	set ^b("peach","$1.00")=""
    
    	do ^Join("a","b","c","a(1)=b(1)")
    
    	yields:
    
    	^c("apple","$0.95","apple","$0.95")
    

  7. ^TablePrint(arr,indt, indtchr);

    Prints the global array in tabular form. The parameter "indt" is the number of positions between columns whereas "indtchr" is the character repeated between columns.

    +#include <mumpsc/libmpsrdbms.h>
          zmain
    +global a("a");
          for i=1:1:3 do
          . for j=1:1:3 do
          .. for k=1:1:3 do
          ... for m=1:1:3 do
          .... for n=1:1:3 do
          ..... for p=1:1:3 do
          ...... for q=1:1:3 do
          ....... for r=1:1:3 do
          ........ for s=1:1:3 do
          ......... set ^a("a"_i,"b"_j,"c"_k,"d"_m,"e"_n,"f"_p,"g"_q,"h"_r,"i"_s)=""
          do ^TablePrint("a",1," ")

    prints:

    a1 b1 c1 d1 e1 f1 g1 h1 i1
    a1 b1 c1 d1 e1 f1 g1 h1 i2
    a1 b1 c1 d1 e1 f1 g1 h1 i3
    a1 b1 c1 d1 e1 f1 g1 h2 i1
    a1 b1 c1 d1 e1 f1 g1 h2 i2
    a1 b1 c1 d1 e1 f1 g1 h2 i3
    . . .
    a3 b3 c3 d3 e3 f3 g3 h2 i1
    a3 b3 c3 d3 e3 f3 g3 h2 i2
    a3 b3 c3 d3 e3 f3 g3 h2 i3
    a3 b3 c3 d3 e3 f3 g3 h3 i1
    a3 b3 c3 d3 e3 f3 g3 h3 i2
    a3 b3 c3 d3 e3 f3 g3 h3 i3


Programming Examples

#############################################################################
# Example program to collect, store, and retrieve lab tests in a global array
#############################################################################
strt  Read !!,"Enter Patient ID or <cr> to quit: ",ptid
      If ptid="" Write "bye",!! Halt
      Read "Enter R for Retrieve or S to store: ",op
      If op="S"!(op="s") Do
      . Write "For patient ",ptid," enter test name: " Read test
      . If test="" Goto err
      . Write "For patient ",ptid," enter date: " Read date
      . If date="" Goto err
      . Write "For patient ",ptid," enter result: " Read rslt
      . If rslt="" Goto err
      . Set ^patient(ptid,test,date)=rslt
      . Write "For patient ",ptid," test ",test," date ",date," result ",rslt," stored",!!
      . Goto strt
      If op="R"!(op="r") Do
      . If $Data(^patient(ptid))=0 Write "Patient not found",! Goto strt
      . Write !,"For patient ",ptid,!!
      . Set test=-1
      . For i=1:1 Do
      .. Set test=$Next(^patient(ptid,test))
      .. If test=-1 Break
      .. Set date=-1
      .. For j=1:1 Do
      ... Set date=$Next(^patient(ptid,test,date))
      ... If date=-1 Break
      ... Write ptid,?10,test,?20,date,?40,^patient(ptid,test,date),!
      . Goto strt
err   Write "Sorry, code not recognized",!

GUI's, Gtk and Glade

Glade is a "drag and drop" graphical user interface builder that uses the Gtk graphics library. Glade performs many of the same functions as Power Builder and other related systems and permits you to quickly build GUI interfaces for your programs.

While most Mumps implementations are not compatible with GUI builder software, the output of the Mumps Compiler is. This section gives an example of how to build a simple GUI interface for a Mumps program that extracts data from the global array data base.

In this example, the user will type in a name and the program will do a lookup on the name and reply with the phone number associated with the name. The data base consists of a global array called "name" whose indices are names and whose values are phone numbers:

      set ^name("jones, j. j.")="508 778 1234"
      set ^name("smith, j. j.")="508 778 1235"
      set ^name("doe, j. j.")="508 778 1236"
      set ^name("wright, j. j.")="508 778 1237"
      . . .

First, start glade. If not installed, you may need to install it or download a free copy from:

http://www.gtk.org

Glade is released under the GNU GPL/LGPL licenses. Once glade is installed, type "glade" at your command prompt. This will cause three windows to appear on your screen (note: you will need the jpeg files associated with the Mumps Compiler Programmers' Guide in order to see the screen shots used in this section):

In the left window is a collection of widgets that you may drag and drop into your application. In the middle window are the controls and a list of resources and in the right window are the controls used to set the properties of the widgets. The beach in the background is where you'll be with all the free time you have because this is so quick...

Next, click File | New Project in the middle window. Now click File | Project Options in the middle window and see a display like:

The default settings are correct except you may or may not want to edit the project name and the directory into which it will be placed. Shown in the image are defaults from the author's system.

Click "OK" when done. Next, go to the left window and click on the Gnome button. A new group of icons will appear. Place your mouse over the upper left ("Gnome Application Window"), click the mouse and a new window will appear - a skeletal application window. At the same time, the project window (the original middle window) will indicate "app1":

In the new window, click over tab (docking tab) that appears to the left of the "File Edit View..." box. Note that an outline will appear. Right click the mouse and select "Delete". The menu bar will disappear. Do the same for the bar of buttons. These are not needed for this example. Note that no boxes remain. If they do, click on them, then right click and delete. your screen should look like this:

Click on the "Gtk+ Basic" button in the leftmost window. Click on the "Vertical Box" icon (row 8, column 2). Click in the cross-hatched area of the GUI window. You will see a pop-up box asking how many rows: select "2" and click "OK". Click on the "Horizontal Box" icon in the leftmost window (row 8, column 1) and click in the upper frame of the GUI window. Select 2 columns and click "OK". Select "Text Box" (row 2, column 4) from the leftmost window and click in the lower frame of the GUI window. Click on "Text Entry" (row 2, column 2) and click in the left frame of the top frame of the GUI window. Click on "Button" (row 3, column 1) in the leftmost window and click in the upper right frame of the GUI window. Your screen should now look like this (your boxes may have different proportions; these can be adjusted):

Now click the upper right window (the "Button" window) of the GUI. Note that the Properties window is repainted. Now click the "Signals" tab in the "Properties" window. Click the button to the far right of the entry "Signal:" (near the middle of the screen, not the one near the top). A new window will appear with a list of signals. Under "Gtk Button Signals" click "clicked":

Click the "Add" button at the bottom of the "Properties" window. The signal will now appear in the top "Properties" window.

Now in the main project (center) window, click on "app1". This will cause the properties window to re-draw. In the Properties window, click on "Signals". Click the button to the right of the "Signal:" entry mid-way down the window. In the pop-up list of signals, scroll to the bottom and select "Destroy". Click "OK". Click "Add". This adds a handler to handle the case when the user clicks on the window's "X" box (destroy) on the upper right hand corner of the window.

Now in the main project window, click "File | Build Source Code" and then "File | Save".

Now you are ready to write your Mumps programs. Move to the project's "src" directory. Enter the following Mumps program:

^getphone(nme,+GtkText *p1)
      set i=-1
      for  do
      . set i=$next(^name(i))
      . if i="-1" break
      . if i=nme do
      .. set out=i_" "_^name(i)_$c(10)
+     gtk_text_insert(p1,NULL,NULL,NULL,Mget("out"),-1);
      .. goto exit
      set i=1
+     gtk_text_insert(p1,NULL,NULL,NULL,"Not Found\n",-1);
exit  quit ""

The above program assumes the existence of the "^name(i)" array as shown at the beginning of this section. You need to create this array separately in the directory from which you will run your GUI application.

The program is simple but sufficient to demonstrate the principles involved. Actual applications will be considerably longer and may involve multiple subroutines. Program details:

  1. The "^" preceding the program name ("getphone") indicates to the Mumps Compiler that this is to become a C++ language subroutine.
  2. The expression "+GtkText *p1" in the parameter list is a C++ language pointer being received from the calling routine (the GUI interface itself). The "+" indicates to the compiler that the next argument is not a standard Mumps argument.
  3. The lines beginning with "+" are inline C++ code. They are calls to the Gtk routines to write output to the GUI text box. The first of these lines writes the value of the variable "out" to the text box. The function "Mget()" returns a pointer to the entry in the Mumps symbol table for the variable name given as an argument (or NULL if not found). The Gtk routine requires the pointer "p1" provided by the calling routine.

Next, compile the Mumps program to C++ (but not to an executable binary) by typing at the command prompt:

mumps2c getphone.mps

(where "getphone.mps" is the file name for the Mumps subroutine).

Next, move to the directory above the "src" directory and execute the file "autogen.sh", which will build the configuration files. When done, return to the "src" directory.

Next, insert the call to "getphone()" into the GUI "callbacks.c" routine. When you clicked "Build Source Code" above, several files were created in the project "src" directory. One of these is "callbacks.c" which will contain the modules that will be called by the GUI for certain events. Modify "callbacks.c" to look like the following:

#ifdef HAVE_CONFIG_H
#  include <config.h>
#endif

#include <gnome.h>

#include "callbacks.h"
#include "interface.h"
#include "support.h"
#include <mumpsc/stateVector.h>
#include "getphone.c"                                              // *****

gboolean
on_entry1_enter_notify_event (GtkWidget *widget, GdkEventCrossing *event, gpointer user_data)
{
  return FALSE;
}

void
on_button1_clicked (GtkButton *button, gpointer user_data)
{
  GtkText *p1;                                                     // *****
  GtkWidget *entry;                                                // *****
  const gchar *text;                                               // *****
  struct MSV *svPtr=AllocSV(); /* state vector for Mumps */        // *****

  /* locate the output text box and the input entry by widget name */
  p1=GTK_TEXT(lookup_widget(GTK_WIDGET(button),"text1"));          // *****
  entry=lookup_widget(GTK_WIDGET(button),"entry1");                // *****
  text=gtk_editable_get_chars(GTK_EDITABLE(entry),0,-1);           // *****

  /* call the compiled Mumps subroutine */
  getphone(svPtr, "", (char *) text, p1);                          // *****
}

void
on_app1_destroy (GtkObject *object, gpointer user_data)
{
  gtk_exit(0);                                                     // *****
}

The modifications are (marked with "// *****" in the above):

  1. Insert the line '#include "getphone.c"'. This includes your Mumps program (the C++ version) and provides access to the Gtk "define" symbol environment.

  2. Insert the body of the function "on_button1_clicked()". This is mainly "boiler-plate" code. It is invoked when the "Lookup" button is clicked on the GUI. It allocates a Mumps state vector ("svPtr"), gets the pointer to the output text box ("p1") and a pointer to the input text ("text"), and then invokes your Mumps subroutine ("getphone()"). After the state vector pointer, the arguments to "getphone()" are:

    1. An empty string normally used to define alternative entry points but not used here. This string does not appear among the arguments in the Mumps program but is, instead, captured by the Mumps linkage routine.
    2. The pointer to the string containing the input. This corresponds to the first argument in the Mumps program.
    3. The pointer to the output text box, received as the C++ pointer "p1" by the Mumps program.

  3. Insert the body of the function "on_app1_destroy()". This code is called when the "close window" box is clicked (small "X" in the upper right hand corner of the window frame).

Next you need to modify the "Makefile" to include Mumps linkage. Edit Makefile (in "src") and find the section that looks like (search for "LINK"):

project2: $(project2_OBJECTS) $(project2_DEPENDENCIES)
	@rm -f project2
	$(LINK) $(project2_LDFLAGS) $(project2_OBJECTS) $(project2_LDADD) $(LIBS)

and replace the "LINK" line with the following:

$(LINK) $(project2_LDFLAGS) $(project2_OBJECTS) $(project2_LDADD) -ldb -lmumps -lmpsglobal_bdb $(LIBS) -lpcre

The added link edit references are to the Mumps libraries needed by the Mumps program. Note that "-ldb" links the Berkeley DB library. By default, Glade links the db1 version; "-ldb" should point to db3 or higher. Placing "-ldb" before "$(LIBS)" is important, as it causes the db3 library to be preferred.

Next, you need to build the project. From the "src" directory, type:

make

This will compile and build your application. You may get some warning messages regarding unused variables. This is mainly because the Mumps code only consists of a subroutine and some symbols are not being used. When done, the executable will be in the "src" directory and named after your project. If you permitted the project name to default, it will be called something like "project1" or "project2" and so forth. You may run this program which should look something like this (a search string has been inserted and the "LookUp" button clicked):


How To Write CGI Mumps Scripts


Writing Mumps scripts for CGI execution is relatively simple. The main difference between writing ordinary console based Mumps programs and those to be executed in the CGI interface of a web server concerns input/output.

Output

Normally in a Mumps program, you use the write statement to generate program output that appears as generated on the console running the Mumps program. When running under a web server, however, all your output will be captured by the web server and sent to the web browser. The web browser will format and place your output on the browser's screen.

In order to control the placement, font size, color and other factors concerning the display of your output, you must embed HTML codes in your output. The browser will use these in determining the manner in which to display your output.

Any output you write to the default output device (unit 5) will be sent to the browser by the web server. You may use the write statement to send both text to be displayed as well as HTML codes. For example, given that the variable ptid contains "1234":

 write "<center> Patient ",ptid,"</center>",!

will send the text:

<center> Patient 1234</center>

to the browser and this will cause the text to be centered on the browser's screen.

In order to speed the development process, this Mumps also supports another form of output, the html command, which allows easier mixing of HTML and Mumps code. The text following the command name is sent to the output.

In html commands, the following substitutions will take place:

  1. &! codes, which will be replaced by line feeds (same as the ! in the write statement).

  2. &~expression~ codes whose expression will be evaluated and the result substituted for the entire code. Thus, the above example could also appear as:

    html <center> Patient &~ptid~</center>&!
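The two substitutions can be pictured with a small C sketch. This is only an illustration of the rule described above, not the code used by the Mumps run-time. The names "expand_html()", "lookup_fn" and "demo_lookup()" are hypothetical, and the lookup callback merely stands in for Mumps expression evaluation. The sketch does not handle an expression that itself contains a "~" character.

```c
#include <string.h>

/* Hypothetical callback: returns the text for an expression, or NULL. */
typedef const char *(*lookup_fn)(const char *expr);

/* Expand "&!" to a newline and "&~expr~" to lookup(expr).
   Returns 0 on success, -1 if the output buffer is too small
   or an expression is too long. */
int expand_html(const char *in, char *out, size_t outsz, lookup_fn lookup)
{
    size_t o = 0;

    if (outsz == 0) return -1;
    while (*in != '\0') {
        if (in[0] == '&' && in[1] == '!') {
            if (o + 1 >= outsz) return -1;
            out[o++] = '\n';
            in += 2;
        } else if (in[0] == '&' && in[1] == '~' && strchr(in + 2, '~') != NULL) {
            const char *end = strchr(in + 2, '~');   /* closing ~ */
            size_t n = (size_t)(end - (in + 2));     /* expression length */
            char expr[128];
            const char *val;
            size_t len;

            if (n >= sizeof expr) return -1;
            memcpy(expr, in + 2, n);
            expr[n] = '\0';
            val = lookup(expr);
            if (val == NULL) val = "";               /* undefined: empty text */
            len = strlen(val);
            if (o + len >= outsz) return -1;
            memcpy(out + o, val, len);
            o += len;
            in = end + 1;
        } else {
            /* ordinary character (or "&~" with no closing "~"): copy as is */
            if (o + 1 >= outsz) return -1;
            out[o++] = *in++;
        }
    }
    out[o] = '\0';
    return 0;
}

/* Stand-in for the Mumps symbol table: only "ptid" is defined. */
const char *demo_lookup(const char *expr)
{
    return strcmp(expr, "ptid") == 0 ? "1234" : NULL;
}
```

With "demo_lookup()", the input "<center> Patient &~ptid~</center>&!" expands to the same text that the write statement shown earlier produces, including the trailing line feed.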

The following is a complete example showing both write based and the extended output method mixed together.

      zmain
      html Content-type: text/html &!&!
      html <html> <body bgcolor=SILVER><font size=3>
      html <center><form method="get" action="verify.cgi">
      html <center>Select Patient</center>
      html <select name="ptid" size=8>
      set ptid=-1
      for  do
      . set ptid=$next(^patient(ptid))
      . if ptid<0 break
      . html <option value="&~ptid~"> &~^patient(ptid)~, &~^patient(ptid,"addr1")~,
      . html &~^patient(ptid,"city")~, &~^patient(ptid,"state")~
      write "</select>",!
      html <p><input type="submit" value="Select"></form></center></body></html>
      halt

This program creates a listbox containing the names of patients:

Note the following:

  1. The first line (Content-type...) must be the first line of output sent to the web server. It must be followed by exactly two newline characters.

  2. The next line establishes the browser environment.

  3. The program then initiates an HTML form and sets verify.cgi as the program that will execute to process data from this form. The web server invokes the verify.cgi and passes to it parameters that are sent by the browser when the user submits the form. Other parameters, passed by the browser in the form: parm=value will be parsed on receipt by the Mumps program verify.cgi when it is initialized. The name used as "parm" will become a Mumps variable and the value associated with it will be assigned to this variable when verify.cgi begins execution.

  4. The HTML code now begins the setup of a listbox. The item selected from the box will become the value part of the ptid=value token that the browser will pass to the server.

  5. The Mumps program scans the patient database for names.

  6. Each patient name is entered into the listbox and it is associated with the value of the patient's id number. If a user clicks on a patient name and then on the Submit button, the token ptid=value is filled in by the browser with the patient's id number and it and the name of the Mumps program to execute are sent to the server. Finally, each patient name and id having been entered into the listbox, the form is closed and a submit button placed on the screen.
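The parm=value parsing mentioned in note 3 is done for you when the Mumps program starts. To make the idea concrete, here is a minimal C sketch of what such parsing involves. It is not the run-time's code: "struct parm", "url_decode()" and "parse_query()" are hypothetical names, and the sketch assumes the usual browser encoding in which tokens are separated by "&", spaces are sent as "+", and other special characters are sent as "%XX".

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct parm {
    char name[64];      /* becomes the variable name */
    char value[256];    /* becomes the variable's value */
};

/* Decode "+" and "%XX" from src[0..n) into dst (truncating if needed). */
static void url_decode(const char *src, size_t n, char *dst, size_t dstsz)
{
    size_t i, o = 0;

    for (i = 0; i < n && o + 1 < dstsz; i++) {
        if (src[i] == '+') {
            dst[o++] = ' ';
        } else if (src[i] == '%' && i + 2 < n &&
                   isxdigit((unsigned char)src[i + 1]) &&
                   isxdigit((unsigned char)src[i + 2])) {
            char hex[3];
            hex[0] = src[i + 1];
            hex[1] = src[i + 2];
            hex[2] = '\0';
            dst[o++] = (char)strtol(hex, NULL, 16);
            i += 2;
        } else {
            dst[o++] = src[i];
        }
    }
    dst[o] = '\0';
}

/* Split a query string such as "ptid=1234&x=y" into at most max pairs.
   Tokens without "=" or with an empty name are skipped.
   Returns the number of pairs stored. */
int parse_query(const char *qs, struct parm *out, int max)
{
    int count = 0;

    while (*qs != '\0' && count < max) {
        size_t len = strcspn(qs, "&");            /* one parm=value token */
        const char *eq = memchr(qs, '=', len);

        if (eq != NULL && eq != qs) {
            size_t nlen = (size_t)(eq - qs);
            url_decode(qs, nlen, out[count].name, sizeof out[count].name);
            url_decode(eq + 1, len - nlen - 1,
                       out[count].value, sizeof out[count].value);
            count++;
        }
        qs += len;
        if (*qs == '&') qs++;
    }
    return count;
}
```

For the form above, a submitted query string of "ptid=1234" yields one pair whose name is "ptid" and whose value is "1234", which corresponds to the variable that verify.cgi finds set when it begins execution.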

Technical Notes

  1. Special Functions

    The collection of built-in functions is noted in the files builtins.cpp and mumpsc/builtin.h. The file builtin.h has a list of initialized names in the char * array BuiltIn and a list of function headers. Normally, when Mumps calls a separately compiled function, all the user parameters are evaluated and only their values are passed. If a function name appears in mumpsc/builtin.h in the array BuiltIn[], global arrays will not be passed as the value of the array node reference but, rather, as the array reference. For example, if the global array node is ^A(apples,oranges,pears) and if apples=10, oranges=20 and pears=30, and the value stored at ^A(10,20,30) is "fruit", a normal call to a separately compiled function:

    set i=$$^function(^A(apples,oranges,pears))

    would pass the value "fruit" to the function. If, however, the name of the function is in BuiltIn[], the same function call will pass a string whose contents are:

    ^A\x0110\x0120\x0130\x01

    where \x01 stands for a byte containing the value 1. This permits the global array reference itself to be passed, with its indices evaluated. This is needed in order to permit access to some MDH functions which require the actual global array reference as opposed to the value stored at that global array node.
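    The layout of such a string can be sketched in C as follows. This is an illustration of the format only, not the compiler's own code; the name "encode_ref()" is hypothetical. Note that in C source the text must be written as separate adjacent literals ("^A\x01" "10\x01"), because "\x0110" would be read as a single hexadecimal escape.

```c
#include <string.h>

/* Build name, then each evaluated index, each followed by a \x01 byte.
   Returns 0 on success, -1 if out is too small. */
int encode_ref(const char *name, const char **idx, int n,
               char *out, size_t outsz)
{
    size_t len = strlen(name);
    int i;

    if (len + 2 > outsz) return -1;        /* name + \x01 + NUL */
    memcpy(out, name, len);
    out[len++] = '\x01';

    for (i = 0; i < n; i++) {
        size_t k = strlen(idx[i]);
        if (len + k + 2 > outsz) return -1; /* index + \x01 + NUL */
        memcpy(out + len, idx[i], k);
        len += k;
        out[len++] = '\x01';
    }
    out[len] = '\0';
    return 0;
}
```

    Called with the name "^A" and the indices "10", "20" and "30", the function produces the string shown above.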

  2. Collating Sequences

    Normally, the collating sequence for global and local arrays will be ASCII. This is for efficiency reasons. Any technique which will cause the collating sequence to become that of "standard" Mumps will significantly increase overhead.

    There are two ways to change the collating sequence. One is to modify the comparison function used by the Berkeley DB (set_db_compare()). This function is called each time the data base system compares keys in the data base. The default function is an ASCII compare. Any other function can be used but they would all increase overhead on all data base activity.

    The other approach, which we have used here, is a bit of a kludge but it is functional and adds no significant overhead to processing non-numeric keys. This approach converts all numeric indices into a format of:

    sscanf(tmpf,"%lf",&x);
    sprintf(tmpf,GPADFMT,'\x1f',x);

    where "tmpf" initially contains the original character string of the numeric index. "GPADFMT" is "%c%+040.20lf" and this value is set in "sysparms.h". The net effect is to convert all numeric indices to 41 byte strings that can be compared with the ASCII comparison function. The first byte (\x1f) causes all numeric indices to be collated before all non-numeric indices, as per the Mumps "standard". The second byte is set to "0" for negative numbers and "1" for positive numbers. Decoding routines in "global.c" and "sym.c" reverse the process. The format controls "GPADFMT" and "GPADWIDTH" in "sysparms.h" control the width and precision of the converted indices.
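    The effect of the padding can be seen in a short C sketch. It follows the description above but is not the code from "global.c" or "sym.c"; the names "PAD_FMT", "PAD_WIDTH" and "pad_numeric_key()" are hypothetical, with "PAD_FMT" simply repeating the "GPADFMT" value quoted above. The sketch only shows the ordering of non-negative keys, of negative keys relative to positive ones, and of numeric keys relative to non-numeric ones.

```c
#include <stdio.h>
#include <string.h>

#define PAD_FMT   "%c%+040.20lf"  /* same text as the GPADFMT value above */
#define PAD_WIDTH 41              /* 1 prefix byte + 40 formatted characters */

/* Convert a numeric index string to its padded form.
   Returns 0 on success, -1 if key does not start with a number
   or the padded form does not fit the fixed width. */
int pad_numeric_key(const char *key, char *out, size_t outsz)
{
    double x;
    int n;

    if (sscanf(key, "%lf", &x) != 1) return -1;
    n = snprintf(out, outsz, PAD_FMT, '\x1f', x);
    if (n != PAD_WIDTH || (size_t)n >= outsz) return -1;

    /* replace the sign so that negatives sort before positives */
    out[1] = (x < 0) ? '0' : '1';
    return 0;
}
```

    As plain strings, "9" collates after "10" because "9" is greater than "1" in ASCII. After padding, the key for 9 compares lower than the key for 10 under strcmp(), and both compare lower than a non-numeric key such as "apple" because of the leading \x1f byte.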

Appendix B
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

      Preamble

 The licenses for most software are designed to take away your
freedom to share and change it.  By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users.  This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it.  (Some other Free Software Foundation software is covered by
the GNU Library General Public License instead.)  You can apply it to
your programs, too.

 When we speak of free software, we are referring to freedom, not
price.  Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.

 To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.

 For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have.  You must make sure that they, too, receive or can get the
source code.  And you must show them these terms so they know their
rights.

 We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.

 Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software.  If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.

 Finally, any free program is threatened constantly by software
patents.  We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary.  To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.

 The precise terms and conditions for copying, distribution and
modification follow.
     GNU GENERAL PUBLIC LICENSE
  TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

 0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License.  The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language.  (Hereinafter, translation is included without limitation in
the term "modification".)  Each licensee is addressed as "you".

Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope.  The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.

 1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.

You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.

 2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:

   a) You must cause the modified files to carry prominent notices
   stating that you changed the files and the date of any change.

   b) You must cause any work that you distribute or publish, that in
   whole or in part contains or is derived from the Program or any
   part thereof, to be licensed as a whole at no charge to all third
   parties under the terms of this License.

   c) If the modified program normally reads commands interactively
   when run, you must cause it, when started running for such
   interactive use in the most ordinary way, to print or display an
   announcement including an appropriate copyright notice and a
   notice that there is no warranty (or else, saying that you provide
   a warranty) and that users may redistribute the program under
   these conditions, and telling the user how to view a copy of this
   License.  (Exception: if the Program itself is interactive but
   does not normally print such an announcement, your work based on
   the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole.  If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works.  But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.

Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.

In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.

 3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:

   a) Accompany it with the complete corresponding machine-readable
   source code, which must be distributed under the terms of Sections
   1 and 2 above on a medium customarily used for software interchange; or,

   b) Accompany it with a written offer, valid for at least three
   years, to give any third party, for a charge no more than your
   cost of physically performing source distribution, a complete
   machine-readable copy of the corresponding source code, to be
   distributed under the terms of Sections 1 and 2 above on a medium
   customarily used for software interchange; or,

   c) Accompany it with the information you received as to the offer
   to distribute corresponding source code.  (This alternative is
   allowed only for noncommercial distribution and only if you
   received the program in object code or executable form with such
   an offer, in accord with Subsection b above.)

The source code for a work means the preferred form of the work for
making modifications to it.  For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable.  However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.

If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
 4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License.  Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.

 5. You are not required to accept this License, since you have not
signed it.  However, nothing else grants you permission to modify or
distribute the Program or its derivative works.  These actions are
prohibited by law if you do not accept this License.  Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.

 6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions.  You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.

 7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License.  If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all.  For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.

If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.

It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices.  Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.

This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
 8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded.  In such case, this License incorporates
the limitation as if written in the body of this License.

 9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time.  Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.

Each version is given a distinguishing version number.  If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation.  If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.

 10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission.  For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this.  Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.

      NO WARRANTY

 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.

 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.

END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs

 If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.

 To do so, attach the following notices to the program.  It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.

   <one line to give the program's name and a brief idea of what it does.>
   Copyright (C) 19yy  <name of author>

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 2 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA


Also add information on how to contact you by electronic and paper mail.

If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:

   Gnomovision version 69, Copyright (C) 19yy name of author
   Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
   This is free software, and you are welcome to redistribute it
   under certain conditions; type `show c' for details.

The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License.  Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.

You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary.  Here is a sample; alter the names:

 Yoyodyne, Inc., hereby disclaims all copyright interest in the program
 `Gnomovision' (which makes passes at compilers) written by James Hacker.

 <signature of Ty Coon>, 1 April 1989
 Ty Coon, President of Vice

This General Public License does not permit incorporating your program into
proprietary programs.  If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library.  If this is what you want to do, use the GNU Library General
Public License instead of this License.

GNU Free Documentation License
Version 1.1, March 2000

 Copyright (C) 2000  Free Software Foundation, Inc.
     59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.


0. PREAMBLE

The purpose of this License is to make a manual, textbook, or other
written document "free" in the sense of freedom: to assure everyone
the effective freedom to copy and redistribute it, with or without
modifying it, either commercially or noncommercially.  Secondarily,
this License preserves for the author and publisher a way to get
credit for their work, while not being considered responsible for
modifications made by others.

This License is a kind of "copyleft", which means that derivative
works of the document must themselves be free in the same sense.  It
complements the GNU General Public License, which is a copyleft
license designed for free software.

We have designed this License in order to use it for manuals for free
software, because free software needs free documentation: a free
program should come with manuals providing the same freedoms that the
software does.  But this License is not limited to software manuals;
it can be used for any textual work, regardless of subject matter or
whether it is published as a printed book.  We recommend this License
principally for works whose purpose is instruction or reference.


1. APPLICABILITY AND DEFINITIONS

This License applies to any manual or other work that contains a
notice placed by the copyright holder saying it can be distributed
under the terms of this License.  The "Document", below, refers to any
such manual or work.  Any member of the public is a licensee, and is
addressed as "you".

A "Modified Version" of the Document means any work containing the
Document or a portion of it, either copied verbatim, or with
modifications and/or translated into another language.

A "Secondary Section" is a named appendix or a front-matter section of
the Document that deals exclusively with the relationship of the
publishers or authors of the Document to the Document's overall subject
(or to related matters) and contains nothing that could fall directly
within that overall subject.  (For example, if the Document is in part a
textbook of mathematics, a Secondary Section may not explain any
mathematics.)  The relationship could be a matter of historical
connection with the subject or with related matters, or of legal,
commercial, philosophical, ethical or political position regarding
them.

The "Invariant Sections" are certain Secondary Sections whose titles
are designated, as being those of Invariant Sections, in the notice
that says that the Document is released under this License.

The "Cover Texts" are certain short passages of text that are listed,
as Front-Cover Texts or Back-Cover Texts, in the notice that says that
the Document is released under this License.

A "Transparent" copy of the Document means a machine-readable copy,
represented in a format whose specification is available to the
general public, whose contents can be viewed and edited directly and
straightforwardly with generic text editors or (for images composed of
pixels) generic paint programs or (for drawings) some widely available
drawing editor, and that is suitable for input to text formatters or
for automatic translation to a variety of formats suitable for input
to text formatters.  A copy made in an otherwise Transparent file
format whose markup has been designed to thwart or discourage
subsequent modification by readers is not Transparent.  A copy that is
not "Transparent" is called "Opaque".

Examples of suitable formats for Transparent copies include plain
ASCII without markup, Texinfo input format, LaTeX input format, SGML
or XML using a publicly available DTD, and standard-conforming simple
HTML designed for human modification.  Opaque formats include
PostScript, PDF, proprietary formats that can be read and edited only
by proprietary word processors, SGML or XML for which the DTD and/or
processing tools are not generally available, and the
machine-generated HTML produced by some word processors for output
purposes only.

The "Title Page" means, for a printed book, the title page itself,
plus such following pages as are needed to hold, legibly, the material
this License requires to appear in the title page.  For works in
formats which do not have any title page as such, "Title Page" means
the text near the most prominent appearance of the work's title,
preceding the beginning of the body of the text.


2. VERBATIM COPYING

You may copy and distribute the Document in any medium, either
commercially or noncommercially, provided that this License, the
copyright notices, and the license notice saying this License applies
to the Document are reproduced in all copies, and that you add no other
conditions whatsoever to those of this License.  You may not use
technical measures to obstruct or control the reading or further
copying of the copies you make or distribute.  However, you may accept
compensation in exchange for copies.  If you distribute a large enough
number of copies you must also follow the conditions in section 3.

You may also lend copies, under the same conditions stated above, and
you may publicly display copies.


3. COPYING IN QUANTITY

If you publish printed copies of the Document numbering more than 100,
and the Document's license notice requires Cover Texts, you must enclose
the copies in covers that carry, clearly and legibly, all these Cover
Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on
the back cover.  Both covers must also clearly and legibly identify
you as the publisher of these copies.  The front cover must present
the full title with all words of the title equally prominent and
visible.  You may add other material on the covers in addition.
Copying with changes limited to the covers, as long as they preserve
the title of the Document and satisfy these conditions, can be treated
as verbatim copying in other respects.

If the required texts for either cover are too voluminous to fit
legibly, you should put the first ones listed (as many as fit
reasonably) on the actual cover, and continue the rest onto adjacent
pages.

If you publish or distribute Opaque copies of the Document numbering
more than 100, you must either include a machine-readable Transparent
copy along with each Opaque copy, or state in or with each Opaque copy
a publicly-accessible computer-network location containing a complete
Transparent copy of the Document, free of added material, which the
general network-using public has access to download anonymously at no
charge using public-standard network protocols.  If you use the latter
option, you must take reasonably prudent steps, when you begin
distribution of Opaque copies in quantity, to ensure that this
Transparent copy will remain thus accessible at the stated location
until at least one year after the last time you distribute an Opaque
copy (directly or through your agents or retailers) of that edition to
the public.

It is requested, but not required, that you contact the authors of the
Document well before redistributing any large number of copies, to give
them a chance to provide you with an updated version of the Document.


4. MODIFICATIONS

You may copy and distribute a Modified Version of the Document under
the conditions of sections 2 and 3 above, provided that you release
the Modified Version under precisely this License, with the Modified
Version filling the role of the Document, thus licensing distribution
and modification of the Modified Version to whoever possesses a copy
of it.  In addition, you must do these things in the Modified Version:

A. Use in the Title Page (and on the covers, if any) a title distinct
   from that of the Document, and from those of previous versions
   (which should, if there were any, be listed in the History section
   of the Document).  You may use the same title as a previous version
   if the original publisher of that version gives permission.
B. List on the Title Page, as authors, one or more persons or entities
   responsible for authorship of the modifications in the Modified
   Version, together with at least five of the principal authors of the
   Document (all of its principal authors, if it has less than five).
C. State on the Title page the name of the publisher of the
   Modified Version, as the publisher.
D. Preserve all the copyright notices of the Document.
E. Add an appropriate copyright notice for your modifications
   adjacent to the other copyright notices.
F. Include, immediately after the copyright notices, a license notice
   giving the public permission to use the Modified Version under the
   terms of this License, in the form shown in the Addendum below.
G. Preserve in that license notice the full lists of Invariant Sections
   and required Cover Texts given in the Document's license notice.
H. Include an unaltered copy of this License.
I. Preserve the section entitled "History", and its title, and add to
   it an item stating at least the title, year, new authors, and
   publisher of the Modified Version as given on the Title Page.  If
   there is no section entitled "History" in the Document, create one
   stating the title, year, authors, and publisher of the Document as
   given on its Title Page, then add an item describing the Modified
   Version as stated in the previous sentence.
J. Preserve the network location, if any, given in the Document for
   public access to a Transparent copy of the Document, and likewise
   the network locations given in the Document for previous versions
   it was based on.  These may be placed in the "History" section.
   You may omit a network location for a work that was published at
   least four years before the Document itself, or if the original
   publisher of the version it refers to gives permission.
K. In any section entitled "Acknowledgements" or "Dedications",
   preserve the section's title, and preserve in the section all the
   substance and tone of each of the contributor acknowledgements
   and/or dedications given therein.
L. Preserve all the Invariant Sections of the Document,
   unaltered in their text and in their titles.  Section numbers
   or the equivalent are not considered part of the section titles.
M. Delete any section entitled "Endorsements".  Such a section
   may not be included in the Modified Version.
N. Do not retitle any existing section as "Endorsements"
   or to conflict in title with any Invariant Section.

If the Modified Version includes new front-matter sections or
appendices that qualify as Secondary Sections and contain no material
copied from the Document, you may at your option designate some or all
of these sections as invariant.  To do this, add their titles to the
list of Invariant Sections in the Modified Version's license notice.
These titles must be distinct from any other section titles.

You may add a section entitled "Endorsements", provided it contains
nothing but endorsements of your Modified Version by various
parties--for example, statements of peer review or that the text has
been approved by an organization as the authoritative definition of a
standard.

You may add a passage of up to five words as a Front-Cover Text, and a
passage of up to 25 words as a Back-Cover Text, to the end of the list
of Cover Texts in the Modified Version.  Only one passage of
Front-Cover Text and one of Back-Cover Text may be added by (or
through arrangements made by) any one entity.  If the Document already
includes a cover text for the same cover, previously added by you or
by arrangement made by the same entity you are acting on behalf of,
you may not add another; but you may replace the old one, on explicit
permission from the previous publisher that added the old one.

The author(s) and publisher(s) of the Document do not by this License
give permission to use their names for publicity for or to assert or
imply endorsement of any Modified Version.


5. COMBINING DOCUMENTS

You may combine the Document with other documents released under this
License, under the terms defined in section 4 above for modified
versions, provided that you include in the combination all of the
Invariant Sections of all of the original documents, unmodified, and
list them all as Invariant Sections of your combined work in its
license notice.

The combined work need only contain one copy of this License, and
multiple identical Invariant Sections may be replaced with a single
copy.  If there are multiple Invariant Sections with the same name but
different contents, make the title of each such section unique by
adding at the end of it, in parentheses, the name of the original
author or publisher of that section if known, or else a unique number.
Make the same adjustment to the section titles in the list of
Invariant Sections in the license notice of the combined work.

In the combination, you must combine any sections entitled "History"
in the various original documents, forming one section entitled
"History"; likewise combine any sections entitled "Acknowledgements",
and any sections entitled "Dedications".  You must delete all sections
entitled "Endorsements."


6. COLLECTIONS OF DOCUMENTS

You may make a collection consisting of the Document and other documents
released under this License, and replace the individual copies of this
License in the various documents with a single copy that is included in
the collection, provided that you follow the rules of this License for
verbatim copying of each of the documents in all other respects.

You may extract a single document from such a collection, and distribute
it individually under this License, provided you insert a copy of this
License into the extracted document, and follow this License in all
other respects regarding verbatim copying of that document.


7. AGGREGATION WITH INDEPENDENT WORKS

A compilation of the Document or its derivatives with other separate
and independent documents or works, in or on a volume of a storage or
distribution medium, does not as a whole count as a Modified Version
of the Document, provided no compilation copyright is claimed for the
compilation.  Such a compilation is called an "aggregate", and this
License does not apply to the other self-contained works thus compiled
with the Document, on account of their being thus compiled, if they
are not themselves derivative works of the Document.

If the Cover Text requirement of section 3 is applicable to these
copies of the Document, then if the Document is less than one quarter
of the entire aggregate, the Document's Cover Texts may be placed on
covers that surround only the Document within the aggregate.
Otherwise they must appear on covers around the whole aggregate.


8. TRANSLATION

Translation is considered a kind of modification, so you may
distribute translations of the Document under the terms of section 4.
Replacing Invariant Sections with translations requires special
permission from their copyright holders, but you may include
translations of some or all Invariant Sections in addition to the
original versions of these Invariant Sections.  You may include a
translation of this License provided that you also include the
original English version of this License.  In case of a disagreement
between the translation and the original English version of this
License, the original English version will prevail.


9. TERMINATION

You may not copy, modify, sublicense, or distribute the Document except
as expressly provided for under this License.  Any other attempt to
copy, modify, sublicense or distribute the Document is void, and will
automatically terminate your rights under this License.  However,
parties who have received copies, or rights, from you under this
License will not have their licenses terminated so long as such
parties remain in full compliance.


10. FUTURE REVISIONS OF THIS LICENSE

The Free Software Foundation may publish new, revised versions
of the GNU Free Documentation License from time to time.  Such new
versions will be similar in spirit to the present version, but may
differ in detail to address new problems or concerns.  See
http://www.gnu.org/copyleft/.

Each version of the License is given a distinguishing version number.
If the Document specifies that a particular numbered version of this
License "or any later version" applies to it, you have the option of
following the terms and conditions either of that specified version or
of any later version that has been published (not as a draft) by the
Free Software Foundation.  If the Document does not specify a version
number of this License, you may choose any version ever published (not
as a draft) by the Free Software Foundation.


ADDENDUM: How to use this License for your documents

To use this License in a document you have written, include a copy of
the License in the document and put the following copyright and
license notices just after the title page:

      Copyright (c)  YEAR  YOUR NAME.
      Permission is granted to copy, distribute and/or modify this document
      under the terms of the GNU Free Documentation License, Version 1.1
      or any later version published by the Free Software Foundation;
      with the Invariant Sections being LIST THEIR TITLES, with the
      Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.
      A copy of the license is included in the section entitled "GNU
      Free Documentation License".

If you have no Invariant Sections, write "with no Invariant Sections"
instead of saying which ones are invariant.  If you have no
Front-Cover Texts, write "no Front-Cover Texts" instead of
"Front-Cover Texts being LIST"; likewise for Back-Cover Texts.

If your document contains nontrivial examples of program code, we
recommend releasing these examples in parallel under your choice of
free software license, such as the GNU General Public License,
to permit their use in free software.

--------------------------------------------------------------------------------

GNU LESSER GENERAL PUBLIC LICENSE
Version 2.1, February 1999

 Copyright (C) 1991, 1999 Free Software Foundation, Inc.
     59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.

[This is the first released version of the Lesser GPL.  It also counts
 as the successor of the GNU Library Public License, version 2, hence
 the version number 2.1.]

Preamble

  The licenses for most software are designed to take away your
freedom to share and change it.  By contrast, the GNU General Public
Licenses are intended to guarantee your freedom to share and change
free software--to make sure the software is free for all its users.

  This license, the Lesser General Public License, applies to some
specially designated software packages--typically libraries--of the
Free Software Foundation and other authors who decide to use it.  You
can use it too, but we suggest you first think carefully about whether
this license or the ordinary General Public License is the better
strategy to use in any particular case, based on the explanations below.

  When we speak of free software, we are referring to freedom of use,
not price.  Our General Public Licenses are designed to make sure that
you have the freedom to distribute copies of free software (and charge
for this service if you wish); that you receive source code or can get
it if you want it; that you can change the software and use pieces of
it in new free programs; and that you are informed that you can do
these things.

  To protect your rights, we need to make restrictions that forbid
distributors to deny you these rights or to ask you to surrender these
rights.  These restrictions translate to certain responsibilities for
you if you distribute copies of the library or if you modify it.

  For example, if you distribute copies of the library, whether gratis
or for a fee, you must give the recipients all the rights that we gave
you.  You must make sure that they, too, receive or can get the source
code.  If you link other code with the library, you must provide
complete object files to the recipients, so that they can relink them
with the library after making changes to the library and recompiling
it.  And you must show them these terms so they know their rights.

  We protect your rights with a two-step method: (1) we copyright the
library, and (2) we offer you this license, which gives you legal
permission to copy, distribute and/or modify the library.

  To protect each distributor, we want to make it very clear that
there is no warranty for the free library.  Also, if the library is
modified by someone else and passed on, the recipients should know
that what they have is not the original version, so that the original
author's reputation will not be affected by problems that might be
introduced by others.

  Finally, software patents pose a constant threat to the existence of
any free program.  We wish to make sure that a company cannot
effectively restrict the users of a free program by obtaining a
restrictive license from a patent holder.  Therefore, we insist that
any patent license obtained for a version of the library must be
consistent with the full freedom of use specified in this license.

  Most GNU software, including some libraries, is covered by the
ordinary GNU General Public License.  This license, the GNU Lesser
General Public License, applies to certain designated libraries, and
is quite different from the ordinary General Public License.  We use
this license for certain libraries in order to permit linking those
libraries into non-free programs.

  When a program is linked with a library, whether statically or using
a shared library, the combination of the two is legally speaking a
combined work, a derivative of the original library.  The ordinary
General Public License therefore permits such linking only if the
entire combination fits its criteria of freedom.  The Lesser General
Public License permits more lax criteria for linking other code with
the library.

  We call this license the "Lesser" General Public License because it
does Less to protect the user's freedom than the ordinary General
Public License.  It also provides other free software developers Less
of an advantage over competing non-free programs.  These disadvantages
are the reason we use the ordinary General Public License for many
libraries.  However, the Lesser license provides advantages in certain
special circumstances.

  For example, on rare occasions, there may be a special need to
encourage the widest possible use of a certain library, so that it becomes
a de-facto standard.  To achieve this, non-free programs must be
allowed to use the library.  A more frequent case is that a free
library does the same job as widely used non-free libraries.  In this
case, there is little to gain by limiting the free library to free
software only, so we use the Lesser General Public License.

  In other cases, permission to use a particular library in non-free
programs enables a greater number of people to use a large body of
free software.  For example, permission to use the GNU C Library in
non-free programs enables many more people to use the whole GNU
operating system, as well as its variant, the GNU/Linux operating
system.

  Although the Lesser General Public License is Less protective of the
users' freedom, it does ensure that the user of a program that is
linked with the Library has the freedom and the wherewithal to run
that program using a modified version of the Library.

  The precise terms and conditions for copying, distribution and
modification follow.  Pay close attention to the difference between a
"work based on the library" and a "work that uses the library".  The
former contains code derived from the library, whereas the latter must
be combined with the library in order to run.

GNU LESSER GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

  0. This License Agreement applies to any software library or other
program which contains a notice placed by the copyright holder or
other authorized party saying it may be distributed under the terms of
this Lesser General Public License (also called "this License").
Each licensee is addressed as "you".

  A "library" means a collection of software functions and/or data
prepared so as to be conveniently linked with application programs
(which use some of those functions and data) to form executables.

  The "Library", below, refers to any such software library or work
which has been distributed under these terms.  A "work based on the
Library" means either the Library or any derivative work under
copyright law: that is to say, a work containing the Library or a
portion of it, either verbatim or with modifications and/or translated
straightforwardly into another language.  (Hereinafter, translation is
included without limitation in the term "modification".)

  "Source code" for a work means the preferred form of the work for
making modifications to it.  For a library, complete source code means
all the source code for all modules it contains, plus any associated
interface definition files, plus the scripts used to control compilation
and installation of the library.

  Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope.  The act of
running a program using the Library is not restricted, and output from
such a program is covered only if its contents constitute a work based
on the Library (independent of the use of the Library in a tool for
writing it).  Whether that is true depends on what the Library does
and what the program that uses the Library does.
  
  1. You may copy and distribute verbatim copies of the Library's
complete source code as you receive it, in any medium, provided that
you conspicuously and appropriately publish on each copy an
appropriate copyright notice and disclaimer of warranty; keep intact
all the notices that refer to this License and to the absence of any
warranty; and distribute a copy of this License along with the
Library.

  You may charge a fee for the physical act of transferring a copy,
and you may at your option offer warranty protection in exchange for a
fee.

  2. You may modify your copy or copies of the Library or any portion
of it, thus forming a work based on the Library, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:

    a) The modified work must itself be a software library.

    b) You must cause the files modified to carry prominent notices
    stating that you changed the files and the date of any change.

    c) You must cause the whole of the work to be licensed at no
    charge to all third parties under the terms of this License.

    d) If a facility in the modified Library refers to a function or a
    table of data to be supplied by an application program that uses
    the facility, other than as an argument passed when the facility
    is invoked, then you must make a good faith effort to ensure that,
    in the event an application does not supply such function or
    table, the facility still operates, and performs whatever part of
    its purpose remains meaningful.

    (For example, a function in a library to compute square roots has
    a purpose that is entirely well-defined independent of the
    application.  Therefore, Subsection 2d requires that any
    application-supplied function or table used by this function must
    be optional: if the application does not supply it, the square
    root function must still compute square roots.)

These requirements apply to the modified work as a whole.  If
identifiable sections of that work are not derived from the Library,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works.  But when you
distribute the same sections as part of a whole which is a work based
on the Library, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote
it.

Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Library.

In addition, mere aggregation of another work not based on the Library
with the Library (or with a work based on the Library) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.

  3. You may opt to apply the terms of the ordinary GNU General Public
License instead of this License to a given copy of the Library.  To do
this, you must alter all the notices that refer to this License, so
that they refer to the ordinary GNU General Public License, version 2,
instead of to this License.  (If a newer version than version 2 of the
ordinary GNU General Public License has appeared, then you can specify
that version instead if you wish.)  Do not make any other change in
these notices.

  Once this change is made in a given copy, it is irreversible for
that copy, so the ordinary GNU General Public License applies to all
subsequent copies and derivative works made from that copy.

  This option is useful when you wish to copy part of the code of
the Library into a program that is not a library.

  4. You may copy and distribute the Library (or a portion or
derivative of it, under Section 2) in object code or executable form
under the terms of Sections 1 and 2 above provided that you accompany
it with the complete corresponding machine-readable source code, which
must be distributed under the terms of Sections 1 and 2 above on a
medium customarily used for software interchange.

  If distribution of object code is made by offering access to copy
from a designated place, then offering equivalent access to copy the
source code from the same place satisfies the requirement to
distribute the source code, even though third parties are not
compelled to copy the source along with the object code.

  5. A program that contains no derivative of any portion of the
Library, but is designed to work with the Library by being compiled or
linked with it, is called a "work that uses the Library".  Such a
work, in isolation, is not a derivative work of the Library, and
therefore falls outside the scope of this License.

  However, linking a "work that uses the Library" with the Library
creates an executable that is a derivative of the Library (because it
contains portions of the Library), rather than a "work that uses the
library".  The executable is therefore covered by this License.
Section 6 states terms for distribution of such executables.

  When a "work that uses the Library" uses material from a header file
that is part of the Library, the object code for the work may be a
derivative work of the Library even though the source code is not.
Whether this is true is especially significant if the work can be
linked without the Library, or if the work is itself a library.  The
threshold for this to be true is not precisely defined by law.

  If such an object file uses only numerical parameters, data
structure layouts and accessors, and small macros and small inline
functions (ten lines or less in length), then the use of the object
file is unrestricted, regardless of whether it is legally a derivative
work.  (Executables containing this object code plus portions of the
Library will still fall under Section 6.)

  Otherwise, if the work is a derivative of the Library, you may
distribute the object code for the work under the terms of Section 6.
Any executables containing that work also fall under Section 6,
whether or not they are linked directly with the Library itself.

  6. As an exception to the Sections above, you may also combine or
link a "work that uses the Library" with the Library to produce a
work containing portions of the Library, and distribute that work
under terms of your choice, provided that the terms permit
modification of the work for the customer's own use and reverse
engineering for debugging such modifications.

  You must give prominent notice with each copy of the work that the
Library is used in it and that the Library and its use are covered by
this License.  You must supply a copy of this License.  If the work
during execution displays copyright notices, you must include the
copyright notice for the Library among them, as well as a reference
directing the user to the copy of this License.  Also, you must do one
of these things:

    a) Accompany the work with the complete corresponding
    machine-readable source code for the Library including whatever
    changes were used in the work (which must be distributed under
    Sections 1 and 2 above); and, if the work is an executable linked
    with the Library, with the complete machine-readable "work that
    uses the Library", as object code and/or source code, so that the
    user can modify the Library and then relink to produce a modified
    executable containing the modified Library.  (It is understood
    that the user who changes the contents of definitions files in the
    Library will not necessarily be able to recompile the application
    to use the modified definitions.)

    b) Use a suitable shared library mechanism for linking with the
    Library.  A suitable mechanism is one that (1) uses at run time a
    copy of the library already present on the user's computer system,
    rather than copying library functions into the executable, and (2)
    will operate properly with a modified version of the library, if
    the user installs one, as long as the modified version is
    interface-compatible with the version that the work was made with.

    c) Accompany the work with a written offer, valid for at
    least three years, to give the same user the materials
    specified in Subsection 6a, above, for a charge no more
    than the cost of performing this distribution.

    d) If distribution of the work is made by offering access to copy
    from a designated place, offer equivalent access to copy the above
    specified materials from the same place.

    e) Verify that the user has already received a copy of these
    materials or that you have already sent this user a copy.

  For an executable, the required form of the "work that uses the
Library" must include any data and utility programs needed for
reproducing the executable from it.  However, as a special exception,
the materials to be distributed need not include anything that is
normally distributed (in either source or binary form) with the major
components (compiler, kernel, and so on) of the operating system on
which the executable runs, unless that component itself accompanies
the executable.

  It may happen that this requirement contradicts the license
restrictions of other proprietary libraries that do not normally
accompany the operating system.  Such a contradiction means you cannot
use both them and the Library together in an executable that you
distribute.

  7. You may place library facilities that are a work based on the
Library side-by-side in a single library together with other library
facilities not covered by this License, and distribute such a combined
library, provided that the separate distribution of the work based on
the Library and of the other library facilities is otherwise
permitted, and provided that you do these two things:

    a) Accompany the combined library with a copy of the same work
    based on the Library, uncombined with any other library
    facilities.  This must be distributed under the terms of the
    Sections above.

    b) Give prominent notice with the combined library of the fact
    that part of it is a work based on the Library, and explaining
    where to find the accompanying uncombined form of the same work.

  8. You may not copy, modify, sublicense, link with, or distribute
the Library except as expressly provided under this License.  Any
attempt otherwise to copy, modify, sublicense, link with, or
distribute the Library is void, and will automatically terminate your
rights under this License.  However, parties who have received copies,
or rights, from you under this License will not have their licenses
terminated so long as such parties remain in full compliance.

  9. You are not required to accept this License, since you have not
signed it.  However, nothing else grants you permission to modify or
distribute the Library or its derivative works.  These actions are
prohibited by law if you do not accept this License.  Therefore, by
modifying or distributing the Library (or any work based on the
Library), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Library or works based on it.

  10. Each time you redistribute the Library (or any work based on the
Library), the recipient automatically receives a license from the
original licensor to copy, distribute, link with or modify the Library
subject to these terms and conditions.  You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties with
this License.

  11. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License.  If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Library at all.  For example, if a patent
license would not permit royalty-free redistribution of the Library by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Library.

If any portion of this section is held invalid or unenforceable under any
particular circumstance, the balance of the section is intended to apply,
and the section as a whole is intended to apply in other circumstances.

It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system which is
implemented by public license practices.  Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.

This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.

  12. If the distribution and/or use of the Library is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Library under this License may add
an explicit geographical distribution limitation excluding those countries,
so that distribution is permitted only in or among countries not thus
excluded.  In such case, this License incorporates the limitation as if
written in the body of this License.

  13. The Free Software Foundation may publish revised and/or new
versions of the Lesser General Public License from time to time.
Such new versions will be similar in spirit to the present version,
but may differ in detail to address new problems or concerns.

Each version is given a distinguishing version number.  If the Library
specifies a version number of this License which applies to it and
"any later version", you have the option of following the terms and
conditions either of that version or of any later version published by
the Free Software Foundation.  If the Library does not specify a
license version number, you may choose any version ever published by
the Free Software Foundation.

  14. If you wish to incorporate parts of the Library into other free
programs whose distribution conditions are incompatible with these,
write to the author to ask for permission.  For software which is
copyrighted by the Free Software Foundation, write to the Free
Software Foundation; we sometimes make exceptions for this.  Our
decision will be guided by the two goals of preserving the free status
of all derivatives of our free software and of promoting the sharing
and reuse of software generally.

NO WARRANTY

  15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
LIBRARY IS WITH YOU.  SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

  16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
DAMAGES.

END OF TERMS AND CONDITIONS

How to Apply These Terms to Your New Libraries

  If you develop a new library, and you want it to be of the greatest
possible use to the public, we recommend making it free software that
everyone can redistribute and change.  You can do so by permitting
redistribution under these terms (or, alternatively, under the terms of the
ordinary General Public License).

  To apply these terms, attach the following notices to the library.  It is
safest to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least the
"copyright" line and a pointer to where the full notice is found.

    
    <one line to give the library's name and a brief idea of what it does.>
    Copyright (C) <year>  <name of author>

    This library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
    License as published by the Free Software Foundation; either
    version 2.1 of the License, or (at your option) any later version.

    This library is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
    Lesser General Public License for more details.

    You should have received a copy of the GNU Lesser General Public
    License along with this library; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

Also add information on how to contact you by electronic and paper mail.

You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the library, if
necessary.  Here is a sample; alter the names:

  Yoyodyne, Inc., hereby disclaims all copyright interest in the
  library `Frob' (a library for tweaking knobs) written by James Random Hacker.

  <signature of Ty Coon>, 1 April 1990
  Ty Coon, President of Vice

That's all there is to it!





Appendix D

Mumps 95 Pattern Matching

Author: Matthew Lockner

Mumps 95 compliant pattern matching (the '?' operator) is implemented in this compiler as given by the following grammar:

 pattern         ::= {pattern_atom}
 pattern_atom    ::= count pattern_element
 count           ::= int | '.' | '.' int
                   | int '.' | int '.' int
 pattern_element ::= pattern_code {pattern_code} | string | alternation
 pattern_code    ::= 'A' | 'C' | 'E' | 'L' | 'N' | 'P' | 'U'
 alternation     ::= '(' pattern_atom {',' pattern_atom} ')'

The largest difference between the current and previous standard is the introduction of the alternation construct, an extension that works as in other popular regular expression implementations. It allows any one of several pattern fragments to match a given portion of the subject text.

A string literal must be quoted. Also note that alternations may contain only pattern atoms, not full patterns; while this is a possible shortcoming, it is in accordance with the standard. It would be a trivial matter to extend alternations so that they can contain full patterns, and this may be implemented upon sufficient demand.

Pattern matching is supported by the Perl-Compatible Regular Expressions library (PCRE). A recursive-descent parser in the Mumps library translates Mumps patterns into a form consistent with Perl regular expressions, and PCRE then does the actual work of matching. Internally, much of this translation is simple character-level transliteration (substituting '|' for the comma in alternation lists, for example). Pattern code sequences are translated into the POSIX character classes supported by PCRE and are mostly intuitive, with the possible exception of 'E', which is replaced with [[:print:][:cntrl:]]. Currently, this construct should cover the 7-bit ASCII character set (lower ASCII).
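The translation described above can be sketched as a small recursive-descent parser. The following Python sketch is an illustration only and is not the code in the Mumps library. Its function names (translate, matches and the underscore-prefixed helpers) are hypothetical. Because Python's re module has no POSIX character classes, explicit ASCII ranges stand in for them, and each translated string or alternation is wrapped in a non-capturing group so that its count applies to the whole element. Embedded (doubled) quotes inside string literals are not handled.

```python
import re

# ASCII ranges standing in for the POSIX classes named in the text
# (an assumption of this sketch: Python's re has no [:alpha:] syntax).
CLASS_RANGES = {
    "A": "A-Za-z",            # alphabetic
    "C": r"\x00-\x1f\x7f",    # control
    "E": r"\x00-\x7f",        # everything in 7-bit ASCII
    "L": "a-z",               # lower case
    "N": "0-9",               # numeric
    "P": r"!-/:-@\[-`{-~",    # punctuation, as in POSIX [:punct:]
    "U": "A-Z",               # upper case
}
DIGITS = "0123456789"


def _digits(p, i):
    """Return the run of decimal digits starting at p[i] and the next offset."""
    start = i
    while i < len(p) and p[i] in DIGITS:
        i += 1
    return p[start:i], i


def _count(p, i):
    """count ::= int | '.' | '.' int | int '.' | int '.' int"""
    low, i = _digits(p, i)
    if i < len(p) and p[i] == ".":
        high, i = _digits(p, i + 1)
        return "{" + (low or "0") + "," + high + "}", i
    if not low:
        raise ValueError(f"count expected at offset {i}")
    return "{" + low + "}", i


def _element(p, i):
    """pattern_element ::= pattern_code {pattern_code} | string | alternation"""
    if i < len(p) and p[i] == '"':
        end = p.index('"', i + 1)           # ValueError if the quote is unclosed
        return "(?:" + re.escape(p[i + 1:end]) + ")", end + 1
    if i < len(p) and p[i] == "(":
        branches = []
        while True:
            branch, i = _atom(p, i + 1)     # i + 1 skips the '(' or ','
            branches.append(branch)
            if i >= len(p) or p[i] != ",":
                break
        if i >= len(p) or p[i] != ")":
            raise ValueError(f"')' expected at offset {i}")
        return "(?:" + "|".join(branches) + ")", i + 1
    ranges = ""
    while i < len(p) and p[i] in CLASS_RANGES:
        ranges += CLASS_RANGES[p[i]]
        i += 1
    if not ranges:
        raise ValueError(f"pattern element expected at offset {i}")
    return "[" + ranges + "]", i


def _atom(p, i):
    """pattern_atom ::= count pattern_element"""
    quantifier, i = _count(p, i)
    element, i = _element(p, i)
    return element + quantifier, i


def translate(pattern):
    """pattern ::= {pattern_atom}; returns a Python regular expression."""
    i, parts = 0, []
    while i < len(pattern):
        part, i = _atom(pattern, i)
        parts.append(part)
    return "".join(parts)


def matches(subject, pattern):
    """A Mumps pattern must account for the entire subject string."""
    return re.fullmatch(translate(pattern), subject) is not None
```

With these definitions, matches("123-45-6789", '3N1"-"2N1"-"4N') succeeds, and the alternation pattern '1(1"Mr",1"Mrs")1"."' accepts both "Mr." and "Mrs." but not "Ms.".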

Because pattern translation is string-intensive, this module uses its own set of string-handling functions built on top of the C standard string functions. These functions use no dynamic memory allocation; all operations work on fixed-length buffers whose length is given by the constant STR_MAX in sysparms.h. If an operation overflows a buffer while a compiled Mumps binary is running, a diagnostic is written to stderr and the program terminates. If such termination occurs too frequently, increase the value of STR_MAX.


Appendix E

Using Perl Regular Expressions

Author: Matthew Lockner

In addition to Mumps 95 pattern matching using the '?' operator, it is also possible to perform pattern matching against Perl regular expressions via the perlmatch function. Support for this functionality is provided by the Perl-Compatible Regular Expressions library (PCRE), which supports a majority of the functionality found in Perl's regular expression engine.

The perlmatch function works in a somewhat similar fashion to the '?' operator. It is provided with a subject string and a Perl pattern against which to match the subject. The result of the function is boolean and may be used in boolean expression contexts such as the "If" statement.

Some subtleties that differ significantly from Mumps pattern matching should be noted:
  1. A Mumps match must cover the entire subject string: the match fails if any characters are left unmatched, even when the pattern matches an initial segment of the subject. With perlmatch, the match succeeds as soon as the entire Perl pattern matches an initial segment of the subject string.

  2. The perlmatch function has the side effect of creating variables in the local symbol table to hold backreferences, the equivalent of $1, $2, $3, ... in Perl. Up to nine backreferences are currently supported, and they can be accessed through the same naming scheme as in Perl ($1 through $9). These variables remain defined until the next call to perlmatch, at which point they are replaced by the backreferences captured by that invocation. Unused backreferences are cleared between invocations; that is, if a match operation captures five backreferences, then $6 through $9 will contain the null string.
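The two points above can be illustrated with a short Python sketch. It is not the Mumps implementation: the Python function perlmatch below is a hypothetical stand-in, the dictionary symbols stands in for the local symbol table, and Python's re module replaces PCRE. The sketch takes the text at its word that the match starts at the beginning of the subject, and it assumes that all nine variables are also cleared when a match fails.

```python
import re


def perlmatch(subject, pattern, symbols):
    """Hypothetical stand-in for perlmatch; stores $1..$9 in the dict symbols."""
    for n in range(1, 10):
        symbols[f"${n}"] = ""            # clear backreferences from earlier calls
    found = re.match(pattern, subject)   # anchored at the start, not at the end
    if found is None:
        return False
    for n in range(1, min(found.re.groups, 9) + 1):
        symbols[f"${n}"] = found.group(n) or ""
    return True


def mumps_style_match(subject, pattern):
    """Succeed only when the pattern consumes the whole subject, as '?' does."""
    return re.fullmatch(pattern, subject) is not None


symbols = {}
print(perlmatch("2008-09-07 release", r"(\d{4})-(\d{2})", symbols))  # True
print(symbols["$1"], symbols["$2"])                                   # 2008 09
print(mumps_style_match("2008-09-07 release", r"(\d{4})-(\d{2})"))    # False
```

The first call succeeds because the pattern matches the initial segment "2008-09"; the whole-string test fails because "-07 release" is left over. To require a whole-string match from a Perl pattern, end it with $, as the example below does.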

Examples

This program asks the user to input a telephone number. If the data entered looks like a valid telephone number, it extracts and prints the area code portion using a backreference; otherwise, it prints a failure message and exits.

   Zmain

   Write "Please enter a telephone number:",!
   Read phonenum

   If $$^perlmatch(phonenum,"^(1-)?(\(?\d{3}\)?)?(-| )?\d{3}-?\d{4}$") Do
   . Write "+++ This looks like a phone number.",!
   . Write "The area code is: ",$2,!
   Else  Do
   . Write "--- This didn't look like a phone number.",!

   Halt

The output of several sample runs of the program follows:

Please enter a telephone number:
1-123-555-4567
+++ This looks like a phone number.
The area code is: 123


Please enter a telephone number:
(123)-555-1234
+++ This looks like a phone number.
The area code is: (123)


Please enter a telephone number:
(123) 555-0987
+++ This looks like a phone number.
The area code is: (123)

As in Perl, sections of the regular expression contained in parentheses define what is contained in the backreferences following a match operation. The backreference variables are named in a left-to-right order with respect to the expression, meaning that $1 is assigned the portion matched against the leftmost parenthesized section of the regular expression, with further references assigned names in increasing order. For a much more in-depth treatment of the subject of Perl regular expressions, refer to the perlre manpage distributed with the Perl language (also widely available online).
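For readers who want to experiment with the numbering of backreferences outside Mumps, the same regular expression can be tried in Python, whose re module accepts this pattern unchanged. This is a sketch for experimentation only: area_code is a hypothetical helper, and group 2 of the Python match object plays the role of $2.

```python
import re

# The pattern from the telephone number example above.
PHONE = re.compile(r"^(1-)?(\(?\d{3}\)?)?(-| )?\d{3}-?\d{4}$")


def area_code(number):
    """Return the text captured by the second parenthesized group, or None."""
    found = PHONE.match(number)
    if found is None:
        return None                  # did not look like a phone number
    return found.group(2) or ""      # empty when no area code was entered


for sample in ("1-123-555-4567", "(123)-555-1234", "(123) 555-0987"):
    print(sample, "->", area_code(sample))
```

Counting opening parentheses from the left, group 1 is the optional "1-" prefix, group 2 is the area code with any surrounding parentheses, and group 3 is the separator that follows it.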


Appendix G

Manual Pages

  1. mumpsc
    
    mumpsc(1)						mumpsc(1)
    
    
    NAME
           mumpsc - Mumps compiler
    
    
    SYNOPSIS
           mumpsc [-n|--native|-b|--berkeley|-s[T|U|S]|--server=[udp|tcp|ssl]]
           [-g|--debug] [-d<IP>|--default_ip=<IP>] filename
    
    
    DESCRIPTION
           This manual page explains the mumpsc program. This program
           translates a source file written in the Mumps language
           into a C source file, which is then compiled into an
           executable.

           mumpsc is actually a shell script and wrapper to mumps2c,
           the Mumps to C translator. It also calls cc if a source
           file is successfully generated.
    
           "filename" is of the form "progname.mps" followed by zero or
           more "progs.o".  The ".mps" extension is required.
    
    
    OPTIONS
           -n [ --native ]
                  Use the native database implementation for global
                  array handling.

           -b [ --berkeley ]
                  Use Berkeley DB to manage global array storage.
                  (This is the default)

           -g [ --debug ]
                  Compile the generated C code with debugging symbols,
                  to allow for the use of a debugger on the resulting
                  binaries.
    
    
    
    
    BUGS
           Please e-mail bug reports to the authors. You should
           include a Mumps source file that can reproduce the error,
           and the version number (found in the name of the source
           tarball the compiler is installed from).
    
    
    AUTHOR
           Most of the source code for this Mumps compiler was written
           by Kevin O'Kane, <okane@cs.uni.edu>, at the University
           of Northern Iowa. The Berkeley-DB based global array
           library was written by Matthew Lockner,
           <lockner@cns.uni.edu>.
    
    
    
    SEE ALSO
           The full documentation of this compiler, including a full
           reference on the Mumps language and the dialect supported
           by this compiler, may be found in

           /usr/share/doc/mumpsc/compiler.html

           Also in that area are included a few sample Mumps programs.
    
    MUMPSC			 January 8, 2002			2
    


Appendix H

Perl Compatible Regular Expression Library

Compiled Mumps programs may call upon the Perl Compatible Regular Expression Library. In some cases, this library is distributed with the Mumps Compiler. The following is the PCRE license:

PCRE LICENSE
------------

PCRE is a library of functions to support regular expressions whose syntax
and semantics are as close as possible to those of the Perl 5 language.

Written by: Philip Hazel 

University of Cambridge Computing Service,
Cambridge, England. Phone: +44 1223 334714.

Copyright (c) 1997-2001 University of Cambridge

Permission is granted to anyone to use this software for any purpose on any
computer system, and to redistribute it freely, subject to the following
restrictions:

1. This software is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

2. The origin of this software must not be misrepresented, either by
   explicit claim or by omission. In practice, this means that if you use
   PCRE in software which you distribute to others, commercially or
   otherwise, you must put a sentence like this

     Regular expression support is provided by the PCRE library package,
     which is open source software, written by Philip Hazel, and copyright
     by the University of Cambridge, England.

   somewhere reasonably visible in your documentation and in any relevant
   files or online help data or similar. A reference to the ftp site for
   the source, that is, to

     ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/

   should also be given in the documentation.

3. Altered versions must be plainly marked as such, and must not be
   misrepresented as being the original software.

4. If PCRE is embedded in any software that is released under the GNU
   General Purpose Licence (GPL), or Lesser General Purpose Licence (LGPL),
   then the terms of that licence shall supersede any condition above with
   which it is incompatible.

The documentation for PCRE, supplied in the "doc" directory, is distributed
under the same terms as the software itself.

End

Appendix I

Apple Mac OS/X Install Instructions

INSTALLING MUMPS ON MAC OS 10.2
Chris Johnson (chris.johnson@myrealbox.com)

Mumps can now be installed on OS X, thanks to Apple's migration to its
BSD-like core operating system called Darwin.  A typical installation
of OS X does not provide many of the standard development packages,
but the instructions below will help you set up an environment suitable
for most Mumps programs.

Mumps for OS X does not at this time handle network connections.  If you
do not have OpenSSL installed, you may ignore any SSL errors encountered
while installing Mumps.

Apart from installing missing software, installing Mumps on OS X should
differ little from installing Mumps on other Unix-family machines.  Please
consult INSTALL for other configuration directions.

1. Make sure you have Mac OS X installed with the BSD subsystem.  Mumps
   is designed to run on the Unix family of operating systems, and access
   to a command line and certain system binaries is essential.

2. Download and install the latest Developer Tools for Mac OS X from
   http://developer.apple.com.  This package contains g++ and other
   development tools.

3. Download the latest version of Fink from http://fink.sourceforge.net.
   Follow the directions to install and set up your shell environment
   variables.  (Note the alternate directions for bash users.)

   Fink is similar to Debian's package manager and makes for easy
   installation of applications.  Anything installed with Fink will be
   placed in the /sw hierarchy by default -- this location is not
   conventional but should pose no problem for your Mumps installation.
   (You may need to identify /sw as an additional directory to search for
   application includes and libraries.)

4. Install the Perl Compatible Regular Expression (PCRE) library using
   Fink.

     $ sudo fink install pcre

   At the time of this writing, a manual, non-Fink installation of PCRE
   failed on a two-level namespace error.  The following link describes
   the problem in greater detail.

     http://developer.apple.com/techpubs/macosx/ReleaseNotes/TwoLevelNamespaces.html

   We recommend using Fink to bypass this error.

5. Install the Berkeley Database using Fink.

     $ sudo fink install db4

   It is probable that you have a version of db already installed on your
   system.  Mumps requires a late version 3 or greater.  If you install a
   newer version of db from the source files or using Fink, please note which
   directory you install into.  You will have to provide the path during
   the Mumps installation.

6. Install readline using Fink.

     $ sudo fink install readline

7. Download and decompress Mumps v5.11 or later.

     $ wget http://www.cs.uni.edu/~okane/source/mumpscompiler-5.11.src.tar.gz
     $ tar xvpfz mumpscompiler-5.11.src.tar.gz

8. Install Mumps.

     $ cd mumpsc
     $ ./configure --with-libraries=/sw/lib:/usr/local/pgsql/lib \
       --with-includes=/sw/include:/sw/include/db4:/usr/local/pgsql/include
     $ make
     $ make install

   You can specify nonstandard locations of include and library files using
   the --with-includes and --with-libraries options.  If you installed all
   software using the directions provided here, the above example should
   work fine.  Consult INSTALL for more general Mumps installation
   information.  The commands above place the Mumps Compiler in
   ~/mumps_compiler.  The compilation script is ~/mumps_compiler/bin/mumpsc.
   To do a system install, log in as root and, in step 8, add prefix=/usr to
   the command line arguments of the "configure" command.  This installs the
   Mumps Compiler in /usr/bin/mumpsc, /usr/include/mumpsc, and /usr/lib/mumpsc.

