SNOBOL/SPITBOL Patterns for Lua
libspipat Lua wrapper
lspipat
Robin Haberkorn (robin.haberkorn at googlemail.com)
2010 Robin Haberkorn
The following document is the lspipat
Lua 5.1 module documentation and reference.
Thanks To...
lspipat would not be possible without:
Phil Budne, for spipat.
lspipat is merely a spipat wrapper.
Robert Dewar, who created Macro SPITBOL and
the GNAT.Spitbol package.
spipat was derived from GNAT.Spitbol, which is based on Macro SPITBOL.
Introduction
lspipat is a wrapper to spipat
that brings support for a first-class SNOBOL/SPITBOL-like pattern data type.
Patterns can be constructed and subsequently combined with other patterns,
strings, numbers and functions using binary and unary operators allowing
the construction of grammars describing any Context Free Language.
Patterns can be matched against any Lua string.
A major difference from other pattern matching techniques like regular expressions, besides
the supported language class, is the possibility to construct patterns/grammars in a
readable and intuitive way, somewhat reminiscent of the BNF.
They can include pattern elements that have side-effects (i.e. Lua code executed during
pattern matching) or produce and influence pattern elements dynamically.
For instance, functions can be specified that are executed during matching to produce
the parameters necessary for the interpretation of a pattern element.
Code can be embedded that generates entire patterns on the fly.
Matching previously matched substrings and implementing recursive patterns
are only two applications of the powerful dynamic pattern elements traditionally
offered by SNOBOL pattern matching and thus by lspipat.
SNOBOL/SPITBOL pattern matching was traditionally used in compiler construction
and prototyping, artificial intelligence research and the humanities.
Resources
These internet resources are more or less directly related to lspipat and
might be useful to you:
http://luaforge.net/projects/lspipat/:
lspipat project page at LuaForge, downloads, bug tracker, etc.
http://www.snobol4.org/spipat/:
libspipat downloads
http://pypi.python.org/pypi/spipat/:
libspipat's Python wrapper (included in libspipat
packages).
http://www.infeig.unige.ch/support/ada/gnatlb/g-spipat.html:
GNAT.Spitbol description. Also installed as pattern.txt by lspipat.
ftp://ftp.cs.arizona.edu/snobol/gb.pdf:
The SNOBOL4 Programming Language (The famous Green Book)
ftp://ftp.snobol4.com/spitman.pdf:
Macro SPITBOL Reference Manual
other interesting resources compiled by Phil Budne...
Comparison with SNOBOL
Just as patterns in SNOBOL are combined and constructed dynamically with
binary and unary operators, lspipat also uses operators available in
Lua to construct patterns in a simple and intuitive way.
The operators and pattern-construction functions were chosen so that the pattern construction syntax
is as similar as possible to SNOBOL/SPITBOL.
The following table shows a comparison of operators between
SPITBOL and lspipat:
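Comparison of SPITBOL and lspipat operators
Operation: Alternation
  SPITBOL: |
  lspipat: +
  Notes: Refer to Concatenation and Alternation. Cannot be used to combine two strings.
Operation: Concatenation
  SPITBOL: (space)
  lspipat: *
Operation: Immediate Assignment/Call
  SPITBOL: $
  lspipat: %
  Notes: % and / have the same precedence as * in Lua. Also, only call versions are supported (see Assignment Calls).
Operation: Deferred Assignment/Call
  SPITBOL: .
  lspipat: /
  Notes: Same as for Immediate Assignment/Call.
Operation: Cursor Assignment
  SPITBOL: @ (unary)
  lspipat: # (unary) or Setcur
  Notes: Refer to Cursor Assignment Calls. lspipat only supports a call version.
Operation: Defer Expression
  SPITBOL: * (unary)
  lspipat: - (unary) or Pred
  Notes: Refer to Predicates. In general, expressions can be wrapped in (anonymous) functions to defer them.
Operation: Interrogation/Predicate
  SPITBOL: ? (unary)
  lspipat: (no entry)
Operation: Pattern Match
  SPITBOL: ? or (space)
  lspipat: smatch
  Notes: Refer to smatch. S ? P is roughly equivalent to S:smatch(P) in Lua.
Operation: Substring Replacement
  SPITBOL: =
  lspipat: ssub
  Notes: Refer to ssub. S P = R is roughly equivalent to S:ssub(P, R, 1) in Lua.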
Installation
lspipat uses an autotools build system. The standard
INSTALL file contains instructions on how to use it from
a package builder's perspective.
Nevertheless, there are some quirks that should be mentioned.
Dependencies
spipat 0.9.3+:
You are advised to apply the patch spipat-patches/0.9.3+_image.patch first
before building spipat, even though it is not mandatory.
It fixes a header file (so lspipat can make use of customized
render-to-string functionality) and various bugs.
Lua 5.1:
You probably have this already. The configure script
should be able to cope with Ubuntu and
Lua Binaries
distributions. The standalone Lua compiler is only required if
compilation of Lua scripts is enabled.
Configuration Options
The following special configure script options
are supported:
--enable-lua-libdir=DIR
Change the installation directory of lspipat.
It defaults to LIBDIR/lua/5.1. You probably want this to
point to some directory in Lua's
module search path, so the default should be ok.
--disable-lua-precompile
Disable precompilation of Lua source files.
Naturally, a Lua compiler will not be required when this option
is used.
--disable-lua-strip
Do not strip (i.e. remove debugging symbols from) compiled
Lua sources.
--disable-html-doc
Do not generate HTML documentation. The documentation is usually
derived from Docbook using
XSLTProc.
Disabling this may be useful if you have got some problem
with the tool chain but are satisfied with the precompiled
documentation in the distribution.
Furthermore, you should note that render-to-string results are not
reminiscent of lspipat syntax (used in this document) by default.
For lspipat to be able to customize these renderings,
configure has to find some spipat headers which
are not normally installed.
Therefore it is highly recommended to add spipat's source directory to the C include search path
using the CPPFLAGS variable before running configure.
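Thus, supposing that spipat sources are located in your home directory,
the most common way to install lspipat is to point CPPFLAGS at that source directory when running configure, and then to build and install as described in the INSTALL file.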
Usage
After lspipat has been installed properly, you will
be able to use it in your Lua program by simply requiring lspipat
(i.e. require "lspipat").
The module table will be called spipat, but many functions
(especially pattern constructors) will be registered as globals as well.
Also, some operators will be overloaded.
For details on all that (operators, globals, etc.) refer to
the Module Reference.
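A minimal sketch of the usage described above, assuming that lspipat is installed somewhere on Lua's module search path. It relies only on names documented in the Module Reference (the spipat table, the global Len constructor, the * operator and smatch). The pcall guard is an addition of this sketch, so that the script reports a missing module instead of aborting.

```lua
-- Load the module; pcall keeps the script alive if lspipat is not installed.
local ok, err = pcall(require, "lspipat")

if not ok then
  print("lspipat not available: " .. tostring(err))
else
  -- Module functions live in the table `spipat`;
  -- constructors such as Len are registered as globals as well.
  local pat = Len(2) * "c"            -- overloaded * builds a concatenation
  print(spipat.smatch("abc", pat))    -- function form of smatch
  print(("abc"):smatch(pat))          -- method form of smatch
end
```

Both smatch calls perform the same match; which form to use is a matter of taste.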
Examples
The samples directory in the lspipat source package
contains some small examples that I hope give you some inspiration on how and where to use
lspipat.
samples/exp2bf.lua
Usage: exp2bf.lua expression
Compiles simple arithmetic expressions to Brainfuck programs that, when
executed, evaluate the expression and print the result
(8-bit unsigned integer arithmetic).
Prints these programs to stdout.
Use that for whatever you can imagine ;-)
samples/wave.lua
Usage: wave.lua wavefile
Validates/parses WAV files
and prints some information about them.
This is an example of how to use lspipat
to do pattern matching on "binary" data (formats, protocols). Some
primitives were implemented in Lua for that reason - in the future
there might be a separate C-module to do the encoding/decoding of
integers in different byte-orders more efficiently.
samples/regexp.lua
Small regular expression example/test - uses a comprehensive regular
expression describing IPs.
Variable Deferring Techniques
In SNOBOL, arbitrary expressions could be deferred
(i.e. their evaluation could be deferred) by using the unary asterisk operator.
With lspipat however, you will have to pass functions
(which can be constructed anonymously) to the appropriate constructors to achieve
the same goal.
Deferring expressions which should be combined with other patterns is one
application of the Pred constructor
and - operator respectively.
Deferring variables is just a special case of deferring expressions.
In this chapter, different ways of optimizing variable deferrings will be
explained using a simple example.
For instance, if you would like to assign a
matched quotation character to a local variable and use that to subsequently match
a simple quote/string, you could use function closures to write something like this:
Function Closures for Deferring Purposes
local cquote
-- named "quoted" rather than "string" to avoid overwriting Lua's string library
quoted = Any("\"'") / function(c) cquote = c end
         * Break(function() return cquote end)
         * -function() return cquote end
You may find this solution a bit verbose, compared with
SNOBOL's elegant syntax.
To save some typing you could define your own constructors
that take the name of a global variable (as a string)
and construct patterns whose arguments are retrieved by
a function closure accessing the globals table.
Custom Constructors for Deferring Purposes
function _Break(name)
    return Break(function() return _G[name] end)
end
function _Pred(name)
    return -function() return _G[name] end
end
-- named "quoted" rather than "string" to avoid overwriting Lua's string library
quoted = Any("\"'") / function(c) cquote = c end
         * _Break "cquote"
         * _Pred "cquote"
Of course, if you do not want to pollute the global namespace
your custom functions could just as well access a local table.
Furthermore, you could optimize the code by defining one generic
table access function which is suitable to be used for
lspipat's pattern constructors -
being able to pass so called cookies
to functions comes in handy.
Generic Retrievers for Deferring Purposes
function getGlobal(name) return _G[name] end
function _Break(name) return Break(getGlobal, name) end
function _Pred(name) return Pred(getGlobal, name) end
-- ...
Fortunately, lspipat already defines
such constructors (deferring global variables) for you.
Wherever possible, there are versions of constructors
with leading underscores that work similarly to the ones in
the example above.
You can of course overwrite these constructors, e.g. with
versions accessing a special local table.
Recursive Patterns
Recursive patterns can be implemented just as described above.
Supposing you want to match the repetition of the predefined pattern
P (greedy) you could write
something like that:
Recursive Patterns
Sometimes however when using global variables is inappropriate,
you might want to do the following trick:
Recursive Pattern Trick
It works because foo is still a function in the scope
of the assignment's right side, but a pattern afterwards so the
function - to which no (direct) reference exists anymore - will return
the pattern foo after the assignment.
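The scoping rule behind this trick can be checked in plain Lua without lspipat. In the sketch below a placeholder string stands in for the pattern. The commented assignment shows how the right-hand side might look with lspipat, assuming P is a predefined pattern and using the unary - operator described under Predicates. It is an illustration of the explanation above, not the original listing.

```lua
-- `local function foo` makes foo visible inside its own body, so the body
-- reads the *variable* foo (an upvalue), not a value fixed at definition time.
local function foo() return foo end

local deferred = foo             -- keep a handle on the function for this demo
assert(deferred() == deferred)   -- before the reassignment it returns itself

-- With lspipat the right-hand side would be a pattern expression, e.g.
--   foo = P * -foo + ""         -- assumption: P is a predefined pattern
-- Here a placeholder string stands in for that pattern.
foo = "placeholder for the recursive pattern"

-- The function now yields whatever foo was reassigned to.
assert(deferred() == "placeholder for the recursive pattern")
print(deferred())
```

In the lspipat case the handle called deferred here is held by the pattern itself, which is why no direct reference to the function is needed after the assignment.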
Module Reference
A compilation of all functions in the lspipat
module, global functions registered by the module, methods
and overloaded operators follows.
smatch
smatch: Perform pattern match on a subject string
spipat.smatch( subject, pattern, flags )
subject:smatch( pattern, flags )
Description
Tries to match pattern against subject
using the given flags.
Parameters
subject (string): A string against which the pattern match will be performed
pattern (userdata): The pattern used for matching
flags (number or nil):
Optional spipat flags.
Spipat Flags
Flags are combined by addition (e.g. spipat.match_anchored + spipat.match_debug),
because Lua 5.1 has no bitwise OR operator.
spipat.match_anchored: Match in anchored mode
spipat.match_debug:
Match with progress being printed to stdout.
Useful for pattern debugging as the name suggests.
Return Values
In case of an exception during matching, raises an error.
In case no substring matches, returns a single nil value.
Otherwise returns
number: Start of matched substring
number: End of matched substring
ssub
ssub: Substitute substrings matching a pattern in a subject
spipat.ssub( subject, pattern, replacement, n, flags )
subject:ssub( pattern, replacement, n, flags )
Description
Substitutes regions in subject matching pattern. If replacement is a string, that string
is substituted; if replacement is a function, the result of calling that function is substituted.
The function form is useful for deferring the evaluation of replacement strings
which depend on (are built from) results of the matching process (e.g. call-on-match or call-immediately function executions).
Parameters
subject (string): The subject for the first pattern match
pattern (userdata): The pattern used for matching
replacement (string or function):
Replacement string or a function that's executed after matching to produce the replacement string
n (number or nil):
Optional maximal number of match/replacement operations. The first match
is performed on subject, subsequent matches on the result of the preceding
replacements. Naturally replacement stops when the pattern does not match anymore.
If n is absent or nil, replacement only stops when pattern
does not match anymore.
flags (number or nil):
Optional spipat flags, as in smatch.
Return Values
In case of an exception during matching, raises an error.
Otherwise returns
string: The result of the last replacement performed or the original
subject if no substring matched at all
number: The number of match/replacement operations actually performed
Example
Replacements with spipat.ssub
> print(spipat.ssub("abc ccC bab", Span("abc") / function(s) str = s end, function() return "["..str:upper().."]" end, 2))
[ABC] [CC]C BaB
>
siter
siter: Return iterator of substrings matching a pattern in a subject
spipat.siter( subject, pattern, flags )
subject:siter( pattern, flags )
Description
Returns an iterator function performing a pattern match on subject
and returning the matched substring (start/end positions in subject).
Each time it is called, it begins matching where the last substring ended, but using the same
subject.
Parameters
subject (string): The subject used for pattern matching
pattern (userdata): The pattern used for matching.
Naturally, anchoring the pattern using any of the possible methods is nonsense.
flags (number or nil):
Optional spipat flags, as in smatch.
Return Values
In case of an exception during matching, raises an error.
Otherwise returns
function: The iterator function. Calling it returns
number: Start of matched substring
number: End of matched substring
Example
Iterating through substrings with spipat.siter
> str = "abc"
> for s, e in str:siter(Len(1)) do print(str:sub(s, e)) end
a
b
c
>
free
free: Finalize pattern
spipat.free( pattern )
pattern:free()
Description
Finalizes pattern, i.e. frees memory associated with it and unreferences any
other Lua values (other patterns, functions, etc.) so they can get garbage collected.
Finalizing an already finalized pattern does nothing.
Using a finalized pattern in any function or operator working with a pattern
will raise an error.
free does early what would otherwise be done when the pattern is garbage
collected, so in most cases you will not need it at all.
It may be useful when you would like to free a large pattern you do not need anymore but
removing all references to that pattern and enforcing a full garbage collection cycle
is not feasible.
Parameters
pattern (userdata): The pattern to be finalized
Return Values
Returns nothing.
Example
Finalizing a pattern
> p = Arb()
> p:free()
> print(p * "foo")
stdin:1: Pattern already freed
>
Conversion
topattern: Convert a value to a pattern
tostring: Render a pattern as a string
spipat.topattern( value )
topattern( value )
value:topattern()
tostring( pattern )
Description
topattern creates a pattern for a string or number, matching that string or number.
If value is already a pattern it returns that pattern without modification.
In case of an unsupported value type or miscellaneous error, topattern always
returns nil.
topattern is useful to explicitly create a pattern, e.g. when an operator requires
at least one operand to be a pattern but both are strings, numbers or functions.
Lua's built-in tostring
function called on a pattern renders that pattern as a string reminiscent of
lspipat's pattern construction syntax.
Example
Explicit pattern construction & implicit conversion to strings
> print("2" + 3)
5
> print(topattern("2") + 3)
("2" + "3")
>
dump
dump: Dump a pattern to stdout
spipat.dump( pattern )
Description
dump prints information about a pattern to
stdout.
The kind of information displayed is similar to
tostring's rendering.
It is useful for debugging purposes.
Parameters
pattern (userdata): The pattern to be dumped
Return Values
Returns nothing.
Concatenation and Alternation
*: Concatenate patterns
+: Alternate patterns
pattern * value
value * pattern
pattern * pattern
pattern + value
value + pattern
pattern + pattern
Description
The * operator constructs a concatenation of two values
if at least one of them is a pattern and returns the result as a pattern.
A concatenation matches the left operand immediately followed by the right operand.
The + operator constructs an alternation between two values
if at least one of them is a pattern and returns the result as a pattern.
An alternation matches the left operand and, if that is unsuccessful, the right operand.
The non-pattern values may be strings or numbers, which are matched
just like a pattern built by
topattern.
Even though the patterns participating in the composition will be copied,
references will be kept, so they will not be garbage collected until all patterns
using them are garbage collected.
Return Values
pattern (userdata): Result of the pattern composition
Example
Concatenations and Alternations
> pat = (topattern("ABC") + "AB") * (topattern("DEF") + "CDE") * (topattern("GH") + "IJ")
> assert(spipat.smatch("ABCCDEGH", pat))
> assert(spipat.smatch("ABCDEFIJ", pat))
>
Assignment Calls
%: Call Immediately
/: Deferred Call
pattern % function
pattern / function
Description
The % operator constructs a pattern matching operand pattern and
calling a Lua function whenever pattern matches during a pattern
match (i.e. function may be called more than once while matching regardless of whether
the match fails or succeeds).
On the other hand, the / operator constructs a pattern matching operand
pattern and calling a Lua function at most once - only if
the match succeeds.
In both cases, function receives the following arguments when called:
string: The substring matched by pattern
Its return value is ignored.
Unlike assignment operators in SNOBOL, the % and /
operators in Lua have the same precedence
as the concatenation operator *,
so using parentheses is advised.
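The grouping that results from equal precedence can be observed in plain Lua. The sketch below does not use lspipat: it defines hypothetical stand-in objects whose *, / and % metamethods merely record how Lua groups the operands, and the string "f" stands in for a function operand.

```lua
-- Stand-in objects (not lspipat patterns) that record how Lua groups operators.
local mt = {}
local function node(s) return setmetatable({ s = s }, mt) end
local function show(x) return type(x) == "table" and x.s or tostring(x) end
mt.__mul = function(a, b) return node("(" .. show(a) .. " * " .. show(b) .. ")") end
mt.__div = function(a, b) return node("(" .. show(a) .. " / " .. show(b) .. ")") end
mt.__mod = function(a, b) return node("(" .. show(a) .. " % " .. show(b) .. ")") end

local A, B = node("A"), node("B")

-- Without parentheses the call operator applies to everything on its left.
print((A * B / "f").s)     --> ((A * B) / f)
-- Parentheses attach the function to B alone.
print((A * (B / "f")).s)   --> (A * (B / f))
-- The same left-to-right grouping holds for %.
print((A % "f" * B).s)     --> ((A % f) * B)
```

In lspipat terms, the first form would call the function with the substring matched by the whole concatenation, the second only with the part matched by B.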
Deferred assignments (assign on match & assign immediately) are not directly possible but can be
easily implemented using function closures as described in Variable Deferring Techniques.
Even though the pattern operands will be copied, references will be kept,
so they will not be garbage collected until all patterns
using them are garbage collected.
Furthermore, references to functions will be kept so they will not be
garbage collected until the patterns constructed by the operators are garbage collected.
Return Values
pattern (userdata): Pattern built by the operators
Example
See the examples in Variable Deferring Techniques.
Cursor Assignment Calls
Setcur: Cursor Assignment
spipat.Setcur( function, cookie )
Setcur( function, cookie )
#function
spipat._Setcur( string )
_Setcur( string )
Description
Setcur is a pattern constructor returning a pattern matching the null string ""
(i.e. always succeeds when matched) and immediately calling a Lua function when matched.
This function receives the following arguments when called:
number: The cursor in the subject string.
In other words, the number of characters matched so far from the beginning of the subject string.
cookie: Any Lua value specified as a cookie in the pattern constructor or
nil if no cookie was specified.
Its return value is ignored.
The unary # operator is equivalent to the Setcur constructor with no
cookie specified.
_Setcur is similar to Setcur but actually assigns the cursor position to
the global variable whose name is specified by a string value.
This means that _Setcur(str) does not assign the cursor position to the global variable str
but rather to the variable with the name str contains, e.g. foo if str == "foo".
So generally _Setcur is equivalent to:
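A minimal sketch of that equivalence, assuming only the Setcur calling convention described above (the function receives the cursor and the cookie). The helper setcur_by_name and the stand-in constructor fake_Setcur are hypothetical names introduced for this illustration; taking the constructor as a parameter lets the snippet run without lspipat.

```lua
-- Hypothetical helper (not part of lspipat): given a Setcur-compatible
-- constructor, build a constructor that stores the cursor in a named global.
local function setcur_by_name(Setcur)
  return function(name)
    -- The variable name travels as the cookie, so one closure serves all names.
    return Setcur(function(cursor, key) _G[key] = cursor end, name)
  end
end

-- With lspipat loaded one would write:
--   local my_Setcur = setcur_by_name(Setcur)

-- Stand-in constructor for demonstration only: it "matches" at cursor 3 at once.
local function fake_Setcur(fn, cookie)
  fn(3, cookie)
  return "stand-in pattern"
end

setcur_by_name(fake_Setcur)("foo")
print(foo)  -- the global foo now holds the cursor passed by the stand-in
```

Passing the name as a cookie mirrors the Generic Retrievers example, where a single function is shared by all deferring constructors.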
In a similar manner, other kinds of deferred assignments can be implemented
using function closures as described in Variable Deferring Techniques.
References to function and cookie will be kept so they will not be
garbage collected until the pattern constructed by Setcur is garbage collected.
Return Values
pattern (userdata): Pattern built by the constructor
Predicates
Pred: Predicate Constructor
spipat.Pred( function, cookie )
Pred( function, cookie )
-function
spipat._Pred( string )
_Pred( string )
-string
Description
Pred constructs a pattern which allows you to transparently define its matching behaviour
using a function called when this pattern is attempted to be matched.
It receives the following arguments when invoked:
cookie: Any Lua value specified as a cookie in the pattern constructor or
nil if no cookie was specified.
The function's return value defines the behaviour dynamically, as shown in the following table:
Dynamic Function Return Values
Value: nil (type nil) or true (type boolean)
  Behaviour: Match the "" string, i.e. succeed.
Value: false (type boolean)
  Behaviour: Pattern match fails, like when using the Fail primitive.
Value: any number
  Behaviour: Try to match that number as a string, as if converted to a pattern.
Value: any string
  Behaviour: Try to match that string, as if converted to a pattern.
Value: any pattern
  Behaviour: Try to match that pattern. Returning a pattern assigned to a variable is the way
  to implement recursive patterns.
The unary - operator applied to a function is equivalent
to the Pred constructor with no cookie specified.
_Pred is similar to Pred but actually gets the Lua value defining its behaviour from
the global variable whose name is specified by a string value.
This means that _Pred(str) does not get the value from the global variable str
but rather from the variable with the name str contains, e.g. foo if str == "foo".
So generally _Pred(name) is equivalent to Pred(getGlobal, name), where getGlobal(name) returns _G[name], as in the Generic Retrievers example.
In a similar manner, other kinds of variable deferring as well as recursive patterns can be implemented
using function closures as described in Variable Deferring Techniques.
The unary - operator applied to a string which is not convertible to
a number is equivalent to the _Pred constructor - naturally this
should be true for all global variable names.
This constraint comes from the way Lua handles operations by default (it checks whether it is an arithmetic operation
before evaluating any metamethod - see metatables).
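The constraint can be reproduced in plain Lua. The sketch below does not use lspipat: it installs a stand-in unary-minus metamethod on strings, which returns a labelled string instead of building a pattern, to show when Lua reaches the metamethod. How lspipat actually registers its metamethod is not shown in this document and is not assumed here.

```lua
-- A numeric string is negated arithmetically.
assert(-"10" == -10)

-- Stand-in for what a module can do: give strings a unary-minus metamethod.
-- (Illustrative only; it returns a label instead of a pattern.)
local string_mt = getmetatable("")
local saved = string_mt.__unm
string_mt.__unm = function(s) return "deferred global: " .. s end

-- A string that is not a number cannot be negated arithmetically,
-- so Lua falls back to the metamethod.
local result = -"foo"
print(result)  --> deferred global: foo

string_mt.__unm = saved  -- restore the previous behaviour
```

This is why a global variable whose name looks like a number could not be deferred with the -string shorthand; _Pred would have to be called explicitly.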
References to function and cookie will be kept so they will not be
garbage collected until the pattern constructed by Pred is garbage collected.
Return Values
pattern (userdata): Pattern built by the constructor
String Primitives
Any: Match any character in a set
NotAny: Match any character not in a set
Break: Match characters up to a break character
BreakX: Match characters up to a break character (extending)
NSpan: Match nothing or characters from a set
Span: Match characters from a set
spipat.Any( set )
spipat.Any( function, cookie )
spipat._Any( string )
spipat.NotAny( set )
spipat.NotAny( function, cookie )
spipat._NotAny( string )
spipat.Break( set )
spipat.Break( function, cookie )
spipat._Break( string )
spipat.BreakX( set )
spipat.BreakX( function, cookie )
spipat._BreakX( string )
spipat.NSpan( set )
spipat.NSpan( function, cookie )
spipat._NSpan( string )
spipat.Span( set )
spipat.Span( function, cookie )
spipat._Span( string )
Description
String primitives are pattern constructors that in their first form all take a string or
number (which is converted to a string) as their sole argument
(set).
In their second form they take a Lua function and an optional cookie
as arguments. When the constructed pattern is about to be matched, the function is called
and is supposed to return a string or number (which is converted to
a string) to supply the primitive's argument dynamically.
It receives the following arguments when invoked:
cookie: Any Lua value specified as a cookie in the pattern constructor or
nil if no cookie was specified.
The primitives with a leading underscore (e.g. _Any) are similar but actually get their argument
from a global variable with the name a string argument contains.
This means that for instance _Any(str) does not get its character set from the global variable str
but rather from the variable with the name str contains, e.g. foo if str == "foo".
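So generally _Any(name) is equivalent to Any(getGlobal, name), where getGlobal(name) returns _G[name], as in the Generic Retrievers example.
In a similar manner, other kinds of variable deferring can be implemented
using function closures as described in Variable Deferring Techniques.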
References to function and cookie will be kept so they will not be
garbage collected until the pattern constructed is garbage collected.
The following table describes what these primitives do:
String Primitives
Primitive / Description
Any( S )
Where S is a string, matches a single character that is
any one of the characters in S. Fails if the current
character is not one of the given set of characters.
NotAny( S )
Where S is a string, matches a single character that is
not one of the characters of S. Fails if the current
character is one of the given set of characters.
Break( S )
Where S is a string, matches a string of zero or more
characters up to but not including a break character
that is one of the characters given in the string S.
Can match the null string, but cannot match the last
character in the string, since a break character is
required to be present.
BreakX( S )
Where S is a string, behaves exactly like Break(S) when
it first matches, but if a string is successfully matched,
then a subsequent failure causes an attempt to extend the
matched string.
NSpan( S )
Where S is a string, matches a string of zero or more
characters, each of which is among the characters given in the
string. Always matches the longest possible such string.
Always succeeds, since it can match the null string.
Span( S )
Where S is a string, matches a string of one or more
characters, each of which is among the characters given in the
string. Always matches the longest possible such string.
Fails if the current character is not one of the given
set of characters.
Return Values
pattern (userdata): Pattern built by the constructor
Arbno
Arbno: Matches a pattern any number of times
spipat.Arbno( P )
Arbno( P )
Description
Where P is any pattern, matches any number of instances
of the pattern, starting with zero occurrences. It is
thus equivalent to ("" + (P * ("" + (P * ("" ....)))).
The pattern P may contain any number of pattern elements
including the use of alternation and concatenation.
Arbno is a pattern constructor taking exactly one argument which is
either a pattern or string (which is treated
like it is converted to a pattern first).
A reference to P will be kept if it is a pattern
so it will not be garbage collected until the pattern constructed is garbage collected.
Return Values
pattern (userdata): Pattern built by Arbno
Fence
Fence: Abort match when alternations are sought
spipat.Fence( P )
Fence( P )
Description
Fence is a pattern constructor taking either no argument or exactly one
pattern as an argument.
A reference to pattern P will be kept so it will not
be garbage collected until the pattern constructed is garbage collected.
The following table describes what the two versions do:
Fence Primitive
Primitive / Description
Fence()
Matches the null string at first, and then if a failure
causes alternatives to be sought, aborts the match (like
a Cancel). Note that using Fence at the
start of a pattern has the same effect as matching in anchored mode.
Fence( P )
Where P is a pattern, attempts to match the pattern P
including trying all possible alternatives of P. If none
of these alternatives succeeds, then the Fence pattern
fails. If one alternative succeeds, then the pattern
match proceeds, but on a subsequent failure, no attempt
is made to search for alternative matches of P. The
pattern P may contain any number of pattern elements
including the use of alternation and concatenation.
Return Values
pattern (userdata): Pattern built by Fence
Integer Primitives
Len: Match a number of characters
Pos: Match null string if number of characters have been matched
RPos: Match null string if number of characters remain to be matched
Tab: Match characters until number of characters have been matched
RTab: Match characters until number of characters remain to be matched
spipat.Len( n )
spipat.Len( function, cookie )
spipat._Len( string )
spipat.Pos( n )
spipat.Pos( function, cookie )
spipat._Pos( string )
spipat.RPos( n )
spipat.RPos( function, cookie )
spipat._RPos( string )
spipat.Tab( n )
spipat.Tab( function, cookie )
spipat._Tab( string )
spipat.RTab( n )
spipat.RTab( function, cookie )
spipat._RTab( string )
Description
Integer primitives are pattern constructors that in their first form all take a number or
string (which is converted to a number) as their sole argument
(n).
This number has to be an unsigned integer - sometimes a natural number depending on the
primitive.
If the argument is omitted, zero is assumed.
In their second form the primitives take a Lua function and an optional cookie
as arguments. When the constructed pattern is about to be matched, the function is called
and is supposed to return a number or string (which is converted to
a number) to supply the primitive's argument dynamically.
It receives the following arguments when invoked:
cookie: Any Lua value specified as a cookie in the pattern constructor or
nil if no cookie was specified.
The primitives with a leading underscore (e.g. _Len) are similar but actually get their argument
from a global variable with the name a string argument contains.
This means that for instance _Len(str) does not get its argument from the global variable str
but rather from the variable with the name str contains, e.g. foo if str == "foo".
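So generally _Len(name) is equivalent to Len(getGlobal, name), where getGlobal(name) returns _G[name], as in the Generic Retrievers example.
In a similar manner, other kinds of variable deferring can be implemented
using function closures as described in Variable Deferring Techniques.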
References to function and cookie will be kept so they will not be
garbage collected until the pattern constructed is garbage collected.
The following table describes what these primitives do:
Integer Primitives
Primitive / Description
Len( N )
Where N is a natural number, matches the given number of
characters. For example, Len(10) matches any string that
is exactly ten characters long.
Pos( N )
Where N is a natural number, matches the null string
if exactly N characters have been matched so far, and
otherwise fails.
RPos( N )
Where N is a natural number, matches the null string
if exactly N characters remain to be matched, and
otherwise fails.
Tab( N )
Where N is a natural number, matches characters from
the current position until exactly N characters have
been matched in all. Fails if more than N characters
have already been matched.
RTab( N )
Where N is a natural number, matches characters from
the current position until exactly N characters remain
to be matched in the string. Fails if fewer than N
unmatched characters remain in the string.
Return Values
pattern (userdata): Pattern built by the constructor
Miscellaneous Primitives
Arb: Matches any string
Bal: Matches parentheses balanced strings
Abort: Immediately abort pattern match
Fail: Null alternation
Rem: Match the entire remaining subject string
Succeed: Match the null string in every alternative
spipat.Arb()
Arb()
spipat.Bal()
Bal()
spipat.Abort()
Abort()
spipat.Fail()
Fail()
spipat.Rem()
Rem()
spipat.Succeed()
Succeed()
Description
These are simple pattern constructor
functions.
The following table describes what these primitives do:
Miscellaneous Primitives
Primitive / Description
Arb()
Matches any string. First it matches the null string, and
then on a subsequent failure, matches one character, and
then two characters, and so on. It only fails if the
entire remaining string is matched.
Bal()
Matches a non-empty string that is parentheses balanced
with respect to ordinary () characters.
Examples of balanced strings are "ABC",
"A((B)C)", and "A(B)C(D)E".
Bal matches the shortest possible balanced
string on the first attempt, and if there is a subsequent failure,
attempts to extend the string.
Abort()
Immediately aborts the entire pattern match, signalling
failure. This is a specialized pattern element, which is
useful in conjunction with some of the special pattern
elements that have side effects.
Fail()
The null alternation. Matches no possible strings, so it
always signals failure. This is a specialized pattern
element, which is useful in conjunction with some of the
special pattern elements that have side effects.
Rem()
Matches from the current point to the last character in
the string. This is a specialized pattern element, which
is useful in conjunction with some of the special pattern
elements that have side effects.
Succeed()
Repeatedly matches the null string (it is equivalent to
the alternation ("" + "" + "" ....). This is a special
pattern element, which is useful in conjunction with some
of the special pattern elements that have side effects.
Return Values
pattern (userdata): Pattern built by the constructor
POSIX Extended Regular Expressions
RegExp: Matches a pattern equivalent to a regular expression
spipat.RegExp( expression, captures )
RegExp( expression, captures )
Description
RegExp constructs, from a POSIX Extended Regular Expression,
a pattern that is equivalent to that regular
expression and can be combined with other patterns freely.
It can optionally construct the pattern to save the captures
from a regular expression match in a Lua table.
Even though this implementation should support almost all elements of EREs,
it is considered experimental.
You are advised to use the usual pattern construction primitives.
Parameters
expression (string): The POSIX ERE which is compiled
to a pattern.
captures (table): Optional table, or more precisely
array, to hold subexpression captures.
Naturally, it has to exist when RegExp is called.
When a subexpression is captured (i.e. the pattern equivalent to what is
enclosed in parentheses), the matching string is added to the
end of the table.
Thus, assuming that captures is initially empty, if
RegExp("(a(b))", captures) matches, captures
will be {"b", "ab"}.
Return Values
pattern (userdata): Pattern built by RegExp
Example
Regular Expressions
> print(RegExp "^[[:digit:]]*?(abc\\.|de?)")
Pos(0) * Arbno(Any()) * ("abc." + "d" * ("" + "e"))
>