Pattern matching
================

[[Parent]]: understanding.txt

_Pattern matching_ is the process of testing whether a given string belongs to a given set of strings, called a _pattern_. Remark uses two traditional forms of pattern matching, called globs and regular expressions.

Globs
-----

A _glob_ defines a pattern by a string in which:

  * a character matches itself, except that
  * `*` matches any sequence of characters, including the empty one,
  * `?` matches any single character,
  * `[seq]` matches any single character in the string `seq`, and
  * `[!seq]` matches any single character not in the string `seq`.

For example, `?at.png` matches `bat.png` and `cat.png`, but not `at.png`, because `?` must consume exactly one character. Globs are commonly used to select files in file systems. They cover a reasonable range of patterns while remaining intuitive.
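
As a rough illustration, the glob rules can be tried out with Python's standard `fnmatch` module. This is a minimal sketch, not Remark's own code: the helper `glob_matches` and the file names are made up for the example, and it is only an assumption that Remark's globs behave like `fnmatch` in every detail.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, used only for this illustration.
names = ["bat.png", "cat.png", "boat.png", "cat.jpg"]

def glob_matches(pattern, candidates):
    """Return the candidates that the glob matches as a whole (case-sensitive)."""
    return [name for name in candidates if fnmatchcase(name, pattern)]

print(glob_matches("?at.png", names))     # one arbitrary character, then 'at.png'
print(glob_matches("*.png", names))       # anything ending in '.png'
print(glob_matches("[bc]at.*", names))    # 'b' or 'c', then 'at.', then anything
print(glob_matches("[!b]at.png", names))  # any character except 'b', then 'at.png'
```

`fnmatchcase` is used instead of `fnmatch` so that the result does not depend on the case-folding rules of the operating system.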

Regular expressions
-------------------

A _regular expression_, or _regex_, defines a pattern by a string
constructed using rules such as the following:

 * a character matches itself, except that
 * `.` matches any single character except the newline,
 * `E?` matches `E` at most one time,
 * `E*` matches `E` any number of times,
 * `E+` matches `E` at least one time,
 * `AB` matches `A` and `B` in a sequence,
 * `A|B` matches either `A` or `B`,
 * `(E)` matches `E`,

where `E`, `A`, and `B` are regular expressions. The backslash `\` is used to escape the meta-characters. This list is incomplete; regular expressions are written in [Python's regular expression syntax][PythonRegex]. For example, `(ab)*\.txt` matches `.txt`, `ab.txt`, `abab.txt`, and so on; the dot is escaped because an unescaped `.` would match any character. Remark automatically appends `\Z` to the regex, so that it must match the whole string. Regular expressions are strictly more powerful than globs, but they are also less intuitive.
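
The whole-string matching described above can be sketched with Python's `re` module. The helper names `matches_whole` and `glob_to_regex` are hypothetical and this is not Remark's implementation; in particular, wrapping the regex in a non-capturing group before appending `\Z` is a choice made for this sketch, so that an alternation such as `cat|dog` is anchored as a whole. The second helper uses the standard `fnmatch.translate` to show that a glob can always be rewritten as a regex.

```python
import fnmatch
import re

def matches_whole(regex, text):
    """Test whether the regex matches all of text, not just a prefix."""
    # re.match anchors the match at the start of the string; the appended
    # \Z anchors it at the end. The (?:...) group is an assumption of this
    # sketch: it keeps \Z from binding only to the last alternative.
    return re.match("(?:" + regex + r")\Z", text) is not None

def glob_to_regex(glob):
    """Translate a glob into an equivalent Python regex string."""
    return fnmatch.translate(glob)

# An escaped dot matches only a literal '.'.
print(matches_whole(r"(ab)*\.txt", "abab.txt"))    # True
print(matches_whole(r"(ab)*\.txt", "ab.txt.bak"))  # False: trailing text remains
print(matches_whole(r"cat|dog", "dogs"))           # False: must match the whole string

# The translated glob is an ordinary regex and can be used with re.match.
print(re.match(glob_to_regex("?at.png"), "bat.png") is not None)  # True
```

The exact string returned by `fnmatch.translate` differs between Python versions, so the sketch only uses it for matching and does not rely on its text.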

[PythonRegex]: http://docs.python.org/library/re.html