Mumps 95-compliant pattern matching (the '?' operator) is implemented in this compiler according to the following grammar:
pattern ::= {pattern_atom}
pattern_atom ::= count pattern_element
count ::= int | '.' | '.' int
| int '.' | int '.' int
pattern_element ::= pattern_code {pattern_code} | string | alternation
pattern_code ::= 'A' | 'C' | 'E' | 'L' | 'N' | 'P' | 'U'
alternation ::= '(' pattern_atom {',' pattern_atom} ')'
The largest difference between the current standard and the previous one is the introduction of the alternation construct, which works as it does in other popular regular expression implementations: it allows any one of several pattern fragments to match a given portion of the subject text.
A string literal must be quoted. Note also that alternations may contain only pattern atoms, not full patterns; while this is a possible shortcoming, it is in accordance with the standard. Extending alternations to contain full patterns is a trivial matter and may be implemented upon sufficient demand.
Pattern matching is supported by the Perl-Compatible Regular Expressions
library (PCRE). A recursive-descent parser in the Mumps library translates
Mumps patterns into a form consistent with Perl regular expressions, and
PCRE then does the actual work of matching. Internally, much of this
translation is simple character-level transliteration (substituting '|'
for the comma in alternation lists, for example). Pattern code sequences
are translated to the POSIX character classes that PCRE supports; the
mappings are mostly intuitive, with the possible exception of 'E', which
is replaced with [[:print:][:cntrl:]]. Currently, this construct
should cover the 7-bit ASCII character set (lower ASCII).
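The translation described above can be illustrated with a minimal recursive-descent sketch in C. This is not the library's actual code: the names mumps_to_regex, Xlate, emit and class_for are hypothetical, and so are the choices to anchor the result with '^' and '$', to use plain capturing groups, and to express every count as a brace quantifier. Only the 'E' mapping and the comma-to-'|' substitution come from the text; the other class mappings are the obvious POSIX equivalents and are assumed.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical translator state: input cursor plus a caller-supplied buffer. */
typedef struct {
    const char *in;  /* next unread character of the Mumps pattern */
    char *out;       /* output buffer */
    size_t len;      /* characters written so far */
    size_t cap;      /* buffer capacity, including the terminating NUL */
    int err;         /* set to 1 on syntax error or overflow */
} Xlate;

/* Append s to the output, or record an error if it would not fit. */
static void emit(Xlate *x, const char *s) {
    size_t n = strlen(s);
    if (x->err || x->len + n >= x->cap) { x->err = 1; return; }
    memcpy(x->out + x->len, s, n + 1);
    x->len += n;
}

/* POSIX class for a pattern code, or NULL if c is not a pattern code. */
static const char *class_for(char c) {
    switch (c) {
    case 'A': return "[:alpha:]";
    case 'C': return "[:cntrl:]";
    case 'E': return "[:print:][:cntrl:]";
    case 'L': return "[:lower:]";
    case 'N': return "[:digit:]";
    case 'P': return "[:punct:]";
    case 'U': return "[:upper:]";
    default:  return NULL;
    }
}

static void atom(Xlate *x);

/* pattern_element ::= pattern_code {pattern_code} | string | alternation */
static void element(Xlate *x) {
    if (*x->in == '"') {                     /* quoted string literal */
        x->in++;
        emit(x, "(");
        while (*x->in && *x->in != '"') {
            char c[2] = { *x->in, '\0' };
            if (strchr("\\^$.|?*+()[]{}", *x->in)) emit(x, "\\");
            emit(x, c);
            x->in++;
        }
        if (*x->in != '"') { x->err = 1; return; }
        x->in++;
        emit(x, ")");
    } else if (*x->in == '(') {              /* alternation: ',' becomes '|' */
        x->in++;
        emit(x, "(");
        atom(x);
        while (*x->in == ',') { x->in++; emit(x, "|"); atom(x); }
        if (*x->in != ')') { x->err = 1; return; }
        x->in++;
        emit(x, ")");
    } else if (class_for(*x->in)) {          /* one or more pattern codes */
        emit(x, "[");
        while (class_for(*x->in)) emit(x, class_for(*x->in++));
        emit(x, "]");
    } else {
        x->err = 1;
    }
}

/* pattern_atom ::= count pattern_element
   The count is read first but emitted after the element, as a quantifier. */
static void atom(Xlate *x) {
    char lo[12], hi[12], quant[32];
    size_t n = 0;
    int dot = 0;
    while (isdigit((unsigned char)*x->in) && n < sizeof lo - 1) lo[n++] = *x->in++;
    lo[n] = '\0';
    if (*x->in == '.') { dot = 1; x->in++; }
    n = 0;
    while (dot && isdigit((unsigned char)*x->in) && n < sizeof hi - 1) hi[n++] = *x->in++;
    hi[n] = '\0';
    if (!dot && !lo[0]) { x->err = 1; return; }   /* a count is mandatory */
    if (!dot)                  snprintf(quant, sizeof quant, "{%s}", lo);
    else if (!lo[0] && !hi[0]) strcpy(quant, "*");
    else snprintf(quant, sizeof quant, "{%s,%s}", lo[0] ? lo : "0", hi);
    element(x);
    emit(x, quant);
}

/* pattern ::= {pattern_atom}; returns 0 on success, -1 on error or overflow. */
int mumps_to_regex(const char *pattern, char *out, size_t cap) {
    Xlate x = { pattern, out, 0, cap, 0 };
    if (cap == 0) return -1;
    out[0] = '\0';
    emit(&x, "^");
    while (*x.in && !x.err) atom(&x);
    emit(&x, "$");
    return x.err ? -1 : 0;
}
```

Under these assumptions, the pattern 1.3A translates to ^[[:alpha:]]{1,3}$ and .E translates to ^[[:print:][:cntrl:]]*$. A pattern with no leading count, such as a bare A, is rejected with -1.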
Because pattern translation is string-intensive, this module uses its own set of string-handling functions built on top of the C standard string functions. These functions perform no dynamic memory allocation; every operation uses fixed-length buffers whose length is given by the constant STR_MAX in sysparms.h. If an operation overflows a buffer while a compiled Mumps binary is running, a diagnostic is written to stderr and the program terminates. If such terminations occur too frequently, simply increase the value of STR_MAX.
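The fixed-buffer approach can be sketched as follows. This is only an illustration of the idea, not the module's actual functions: bounded_append and append_or_die are hypothetical names, the value 64 for STR_MAX is made up for the example (the real constant is defined in sysparms.h), and it is assumed here that STR_MAX counts the terminating NUL.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define STR_MAX 64  /* assumed value for this sketch; the real one is in sysparms.h */

/* Append src to dst, where dst is a NUL-terminated STR_MAX-byte buffer.
   Returns 0 on success, or -1 (leaving dst unchanged) if the result
   would not fit. No dynamic memory is allocated. */
int bounded_append(char dst[STR_MAX], const char *src) {
    size_t used = strlen(dst);
    size_t need = strlen(src);
    if (used + need >= STR_MAX) return -1;
    memcpy(dst + used, src, need + 1);  /* copy including the NUL */
    return 0;
}

/* Same operation, but on overflow print a diagnostic to stderr and
   terminate, mirroring the behavior described in the text. */
void append_or_die(char dst[STR_MAX], const char *src) {
    if (bounded_append(dst, src) != 0) {
        fprintf(stderr, "string overflow: result exceeds STR_MAX (%d)\n", STR_MAX);
        exit(EXIT_FAILURE);
    }
}
```

Separating the checked operation from the terminating wrapper is a choice made for this sketch; it keeps the overflow test in one place and makes the bounded operation easy to exercise on its own.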