Pattern matching
================

[[Parent]]: understanding.txt

_Pattern matching_ is the process of testing whether a given string
belongs to a given set of strings, called a _pattern_. Remark uses
two traditional forms of pattern matching, called globs and regular
expressions.

Globs
-----

A _glob_ defines a pattern by a string in which:

  * a character matches itself, except that
  * \* matches anything any number of times, 
  * ? matches anything at most one time, 
  * \[seq\] matches any character in string seq, and 
  * \[!seq\] matches any character not in string seq.
 
For example, ?at.png matches at.png, bat.png, and cat.png.
Globs are commonly used in file systems. They capture a reasonable
amount of patterns, while still being intuitive. 

Regular expressions
-------------------

A _regular expression_, or a regex, defines a pattern by a string
constructed using the following kind of rules:

 * a character matches itself, except that
 * . matches any single character except the newline,
 * E? matches E at most one time,
 * E* matches E any number of times,
 * E+ matches E at least one time,
 * AB matches A and B in a sequence,
 * A|B matches either A or B,
 * (E) matches E,

where E, A, and B are regular expressions. The backslash \\ is used
to escape the meta-characters. This list is
incomplete; the regular expressions are given using the 
[Python's regular expression syntax][PythonRegex]. For example, 
(ab)*\\.txt matches .txt, ab.txt, abab.txt, and so on. In Remark, the 
regex is automatically appended \\Z at the end so that it must match 
the whole string. Regular expressions are strictly more powerful than 
globs, but they are also less intuitive.

[PythonRegex]: http://docs.python.org/library/re.html