{ "info": { "author": "Matthew Hawthorn", "author_email": "hawthorn.matthew@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: Apache Software License", "Operating System :: MacOS :: MacOS X", "Operating System :: POSIX :: Linux", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Topic :: Software Development :: Code Generators", "Topic :: Software Development :: Libraries :: Python Modules", "Topic :: Text Processing", "Topic :: Utilities", "Typing :: Typed" ], "description": "# Regular expressions made readable\n\n## Introduction\n\n`bourbaki.regex` provides an interface for constructing arbitrarily complex \nregular expressions using standard Python syntax.\n\nThe goals of the package are the following:\n\n - allow the user to be as terse as possible while not sacrificing readability\n\n - support the full range of constructs available in the standard library regex engine (`re` module)\n\n - be extensible and modular to support more advanced constructs in the future, \n as for instance provided by the [`regex`](https://pypi.org/project/regex/) module\n\n - treat python string literals as literal strings to be matched wherever possible, obviating the need for special \n constructors\n\n - handle tedious minutiae such as escaping special characters in literals and inferring the correct group index for \n backreferences, allowing the user to specify them as literal references to previously constructed \n `bourbaki.regex.Regex` objects\n\n - raise meaningful errors at compile time, such as named group collisions, nonexistent\n backreferences, and lookbehind assertions that are not fixed-length\n\n\n#### Basic regex constructors\n\nThere are a few base constructors from which all expression patterns can be built.\nEach of them, and all expressions involving them in the sections below, result in instances of `bourbaki.regex.Regex`,\nwhich has the usual methods of a 
compiled python regex - `.match`, `.search`, `.fullmatch`, `.findall`, `.finditer`, \n`.sub`, `.subn`, `.split` - as well as the attribute `.pattern`.\n\nTo compile a pattern with regex flags (e.g. `re.IGNORECASE`), pass them to the `.compile` method.\nThe result will be a usual python regex.\n\n - `bourbaki.regex.C`: Character class constructor.\n `C['a':'z', 'A-Z', '0':'9']` for instance is equivalent to the raw regular expression `r'[a-zA-Z0-9]'`.\n\n - `bourbaki.regex.L/Literal`: Literal string match. This handles escaping special characters that are reserved for \n regular expression syntax.\n For example `L('*foo[bar]*')` is equivalent to the raw regular expression `r'\\*foo\\[bar\\]\\*'`\n (note the '\\' escapes).\n\n - `bourbaki.regex.If`: for construction of conditional patterns.\n For example, \n ```python\n foo = L(\"foo\")\n bar = L(\"bar\")\n foobar = foo.optional + If(foo).then_(bar).else_(\"baz\")\n ```\n `foobar` will now match `\"foobar\"` or `\"baz\"`, but not `\"foo\"`, since the pattern requires `\"bar\"` to follow \n when `\"foo\"` is matched.\n\n - Special symbols, including:\n `START, END, ANYCHAR, StartString, EndString, Tab, Endline, BackSpace, CarriageReturn`\n `WordBoundary, WordInternal, WordChar, NonWordChar, Digit, NonDigit, Whitespace, NonWhitespace`,\n which are self-describing.\n\n\nAll other kinds of pattern can be constructed by the use of operators, method calls, or attribute accesses on previously \nconstructed patterns, as detailed below.\n\n\n#### Repetition\n\nThe `*` (multiplication) operator expresses a fixed number of repetitions of a pattern.\nThis deviates from raw regex syntax but the multiplication operator matches python string semantics.\n\nThe `[]` (`__getitem__`) construct is also used to express repetition over a range of copies.\nThis construct closely resembles its raw regex curly-brace counterpart, while adding some functionality \nand matching the python slice semantics in expressing numeric ranges (though the 
upper bound is always included, as in \nraw regex).\n\nCommon repetition requirements are expressible via the `.one_or_more`, `.zero_or_more`, and `.optional` attributes.\n\n - `L(\"foo\") * 3` will match `\"foofoofoo\"`.\n\n - `L(\"foo\")[1:2]` will match `\"foo\"` or `\"foofoo\"`.\n\n - `L(\"foo\")[:]` is equivalent to `L(\"foo\").zero_or_more` and matches any number of copies of \"foo\", including \n the empty string.\n\n - `L(\"foo\")[1:]` is equivalent to `L(\"foo\").one_or_more` and matches any number of copies of \"foo\", requiring at \n least one.\n\n - `L(\"foo\")[:1]` is equivalent to `L(\"foo\").optional` and matches `\"foo\"` or the empty string.\n\n - `L(\"foo\")[1:5:2]` will match 1, 3 or 5 copies of `\"foo\"` (note that this makes what would otherwise be a somewhat \n complex regex very simple).\n\n\n#### Alternation\n\nThe `|` (pipe/bitwise or) operator is used to express alternation, as it is in raw regex.\n\n - `L(\"foo\") | \"bar\"` will match `\"foo\"` or `\"bar\"`.\n\n - When both sides of the operator are character classes, the pipe operator results in another character class matching \n the union of their contents. This is semantically the same as alternation in a regex, but results in a more concise \n compiled pattern. 
For example, `C['a-z'] | C['0':'9']` compiles to the pattern `'[a-z0-9]'` rather than `'[a-z]|[0-9]'`\n\n\n#### Concatenation\n\nThe binary `+` (addition) operator is used to express concatenation of patterns.\nThis breaks with raw regex syntax (where concatenation is implicit in adjacent patterns) but captures the usual python \nstring semantics.\n\n - `L(\"foo\") + \"bar\"` will match `\"foobar\"` (note that the raw string `\"bar\"` is taken implicitly as a literal).\n\n\n#### Capture groups\n\n`bourbaki.regex` will only construct capture groups when explicitly asked to.\nFunction call syntax may be used to create capture groups, taking a single string argument as the name\n(motivated by the mnemonic that a group is _called_ by a name).\nAlternately, omitting the name results in an unnamed capture group, i.e. in raw regex we put parentheses on either side \nof a pattern to indicate capture, and in `bourbaki.regex`, we place an empty pair at the end of a pattern. \nThe `.as_` method and `.captured` attribute may also be used for this purpose.\n\n - `C['0':'9'].as_(\"a_numeral\")` will result in a regex matching a single digit for whose matches calling the \n `.groupdict()` method will yield a `dict` with the key `\"a_numeral\"`, i.e. this is a named group.\n This is equivalent to `C['0':'9'](\"a_numeral\")`, using the function call syntax.\n\n - `C['0':'9'].captured` is as above, but the group isn't named. 
It will get a number in the resulting compiled\n pattern, and will be accessible by calling `.groups()` on any match object.\n This is equivalent to `C['0':'9']()`, using the function call syntax.\n\n\n#### Lookahead and Lookbehind assertions\n\nLookahead and lookbehind assertions can be constructed with the `>>` and `<<` operators respectively.\nThe pattern which is matched \"points to\" the lookahead or lookbehind assertion.\n\nThe `-` _unary_ operator (negation) is used to express a _negative assertion_.\n\n - For example, `L(\"foo\") >> \"bar\"` will match `\"foo\"`, but only in a string where it is followed by `\"bar\"`.\n\n - Similarly, `L(\"foo\") << \"bar\"` will match `\"bar\"`, but only in a string where it is preceded by `\"foo\"`.\n\n - `\"foo\" >> -L(\"bar\")` will match `\"foo\"`, but only if _not_ followed by `\"bar\"`.\n\n - `-L(\"foo\") << \"bar\"` will match `\"bar\"`, but only if _not_ preceded by `\"foo\"`.\n\n\n#### Comments in compiled patterns\n\nThe `//` operator may take a raw string on the right which serves as a comment.\nIt has no effect on the match behavior of the resulting pattern but will be present as a comment in the \ncompiled pattern.\n\n - `L(\"foo\") // \"foo, the usual placeholder name\"` compiles to the pattern \n `\"(?#foo, the usual placeholder name)foo\"`\n\n\nNote:\nPython's standard library regex engine does not support variable-length lookbehind assertions. 
\nIf you attempt to use a pattern which matches variable-length strings as a lookbehind assertion, you will get a useful error.\nTo use `regex` or another module conforming to python's `re` module API, but supporting variable-length lookbehind \nassertions, simply `import bourbaki.regex.base as bre`, and set \n`bre.REQUIRE_FIX_LEN_LOOKBEHIND = False; bre.re = `.\n\n\n#### Atomic groups\n\nThe unary `+` (positive) operator is used to construct [_atomic groups_](https://www.regular-expressions.info/atomic.html).\nThis means that once the regex engine matches the atomic pattern, it will never backtrack to before the match.\nThe `.atomic` attribute may also be used.\n\nPython's standard library regex engine does not support this feature natively, but other modules such as \n[`regex`](https://pypi.org/project/regex/) do.\nThus, by default, `bourbaki.regex` constructs a standard python regex which _behaves_ as if it were atomic by using a \nbackreference to accomplish the same goal.\nTo use `regex` or another module conforming to python's `re` module API, but supporting \natomic groups natively, simply `import bourbaki.regex.base as bre`, and set \n`bre.ATOMIC_GROUP_SUPPORT = True; bre.re = `.\n\n - For example, `\"a\" + (L(\"bc\") | \"b\").atomic + \"c\"` will match `\"abcc\"` but not `\"abc\"`, since both strings cause the \n atomic group in the middle of the pattern to be consumed as soon as `\"bc\"` is matched, leaving a `\"c\"` still to be \n matched.\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/bourbaki-py/regex", "keywords": "", "license": "Apache License 2.0", "maintainer": "", "maintainer_email": "", "name": "bourbaki.regex", "package_url": "https://pypi.org/project/bourbaki.regex/", "platform": "", "project_url": "https://pypi.org/project/bourbaki.regex/", "project_urls": { "Homepage": "https://github.com/bourbaki-py/regex" }, 
"release_url": "https://pypi.org/project/bourbaki.regex/0.1.7/", "requires_dist": null, "requires_python": "", "summary": "", "version": "0.1.7" }, "last_serial": 5652657, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "b9e1d3463c3272a3bcad600fefce81dd", "sha256": "2a1387596591b53ca8ac275ca95e627558d7a28ad9f20a5abe72355cf553e7c8" }, "downloads": -1, "filename": "bourbaki.regex-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "b9e1d3463c3272a3bcad600fefce81dd", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 15107, "upload_time": "2019-07-21T19:12:31", "url": "https://files.pythonhosted.org/packages/3e/d6/a308e30b0d7d7ea09ee004d635ee130e4e50097e199248ba610f6c94fc7c/bourbaki.regex-0.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "501d2f668dff65c1744d106e433c5596", "sha256": "246c443a2ca014e6d2dda0b86fa56083d1f1b90c839799380436330860bd20be" }, "downloads": -1, "filename": "bourbaki.regex-0.1.0.tar.gz", "has_sig": false, "md5_digest": "501d2f668dff65c1744d106e433c5596", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10794, "upload_time": "2019-07-21T19:12:33", "url": "https://files.pythonhosted.org/packages/4b/24/d79c02d92a60920d1f64f1c55cfdb75273f0146178fdf88918c53a6e1419/bourbaki.regex-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "91fa9aef572132c1d7e22d894d467813", "sha256": "167440be8c6695e1b2890c9e81472b1b6fa014066e8259940e9a0617f64d4a47" }, "downloads": -1, "filename": "bourbaki.regex-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "91fa9aef572132c1d7e22d894d467813", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 16333, "upload_time": "2019-07-22T01:44:01", "url": "https://files.pythonhosted.org/packages/01/f4/0bfe4c188c1d97675a05e68c2ebd12c7b3369817769b21e008f3defc53b7/bourbaki.regex-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": 
"84a91b6749508a4d73ba53c4e68c40f1", "sha256": "0b4cb25d86e9c863f25410538769fcb2bfa6ac345841b18463e5cbc5b66b8140" }, "downloads": -1, "filename": "bourbaki.regex-0.1.1.tar.gz", "has_sig": false, "md5_digest": "84a91b6749508a4d73ba53c4e68c40f1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 13093, "upload_time": "2019-07-22T01:44:03", "url": "https://files.pythonhosted.org/packages/53/75/3b22c5ac9bfeadca853a7dd50b72b9497c3ac4eb6ad69705c485b4a3fe5b/bourbaki.regex-0.1.1.tar.gz" } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "8d55162702ed0b6d0b47fb3ddfee3124", "sha256": "8935755449e36ae69d505cccd836001a6817b8b9d363e9346239919a6debf0f3" }, "downloads": -1, "filename": "bourbaki.regex-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "8d55162702ed0b6d0b47fb3ddfee3124", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 18095, "upload_time": "2019-07-24T12:31:57", "url": "https://files.pythonhosted.org/packages/51/45/5e2d5e403db5fc39667f23d52682206895f959c62d5a80b914e003daacea/bourbaki.regex-0.1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "12603624b87eb25a5de2f81f10c73398", "sha256": "d56b3047f0bb3f1c96015c34229d8fe09b323a0ba23ae572413b19d2c7bbfcd5" }, "downloads": -1, "filename": "bourbaki.regex-0.1.2.tar.gz", "has_sig": false, "md5_digest": "12603624b87eb25a5de2f81f10c73398", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16713, "upload_time": "2019-07-24T12:31:59", "url": "https://files.pythonhosted.org/packages/bd/79/b2669f6947072fc8d7f85c2112baef5283b9d18bdba13f5c3c0961c70da2/bourbaki.regex-0.1.2.tar.gz" } ], "0.1.4": [ { "comment_text": "", "digests": { "md5": "bda3ee201695a168acf76b31ec8d7679", "sha256": "1fee445fd5da220f2cf8abf51558343cc18a0d10597e96474e4649e4a89c1595" }, "downloads": -1, "filename": "bourbaki.regex-0.1.4-py3-none-any.whl", "has_sig": false, "md5_digest": "bda3ee201695a168acf76b31ec8d7679", 
"packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 18232, "upload_time": "2019-08-08T20:17:15", "url": "https://files.pythonhosted.org/packages/69/79/cb29575c6b2e021d0e5fd8aca0330dc163129cb1a48e7fedd7ba7a2d4c6b/bourbaki.regex-0.1.4-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "df42c856e918331f955772b1d55bae9b", "sha256": "12c0c9828d7c9a933b8c8bfd437a65e4ea11358e2e1689d31004731f5f4d8239" }, "downloads": -1, "filename": "bourbaki.regex-0.1.4.tar.gz", "has_sig": false, "md5_digest": "df42c856e918331f955772b1d55bae9b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 21266, "upload_time": "2019-08-08T20:17:16", "url": "https://files.pythonhosted.org/packages/a7/65/88c3b48ec7a0aabd0b093b9519f33b4d542caa6ff23ccd19acfe4dac82d9/bourbaki.regex-0.1.4.tar.gz" } ], "0.1.5": [ { "comment_text": "", "digests": { "md5": "97ac2984506fb659e15de31634a09ea6", "sha256": "478bd6345b3c3495aa59c8c92a32cd1b9d9692f07b38d963c5492377f255bd7e" }, "downloads": -1, "filename": "bourbaki.regex-0.1.5-py3-none-any.whl", "has_sig": false, "md5_digest": "97ac2984506fb659e15de31634a09ea6", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 18233, "upload_time": "2019-08-08T21:09:19", "url": "https://files.pythonhosted.org/packages/df/12/49d8be82d7e1c6230f6eb2c60d0141b02affaa6edd577312e354b9bcecb6/bourbaki.regex-0.1.5-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "aae3e98443d45b913cb4b334b0eecab3", "sha256": "81861315987ae4d48d12bb7f33e253f45fac661241d2d524f92489f4d464c5fd" }, "downloads": -1, "filename": "bourbaki.regex-0.1.5.tar.gz", "has_sig": false, "md5_digest": "aae3e98443d45b913cb4b334b0eecab3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 21307, "upload_time": "2019-08-08T21:09:20", "url": 
"https://files.pythonhosted.org/packages/a5/1f/894aae3896fea9a3c810ec300378899ff848a6c12ef5707dd0de807cbbf0/bourbaki.regex-0.1.5.tar.gz" } ], "0.1.6": [ { "comment_text": "", "digests": { "md5": "9ab73f750857014bcee25f716aac2be1", "sha256": "a4dacce40fee30510a62ebca34d098eb9a54f3a94d4d0682525e7728770a01d7" }, "downloads": -1, "filename": "bourbaki.regex-0.1.6-py3-none-any.whl", "has_sig": false, "md5_digest": "9ab73f750857014bcee25f716aac2be1", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 18292, "upload_time": "2019-08-08T21:58:16", "url": "https://files.pythonhosted.org/packages/8d/45/aad1be8e8b3d46f15b526278662feb62df3bdd5b1ab8a5a030e925370667/bourbaki.regex-0.1.6-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "cd1d5d3bd03fddf1557d30eabb56d8c4", "sha256": "707e63566d669e726308c4c09870bd0fe67d5486de2ed13a49be18995d347184" }, "downloads": -1, "filename": "bourbaki.regex-0.1.6.tar.gz", "has_sig": false, "md5_digest": "cd1d5d3bd03fddf1557d30eabb56d8c4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 21374, "upload_time": "2019-08-08T21:58:18", "url": "https://files.pythonhosted.org/packages/29/1b/3692d6373949c3bb380660d1d10aeda24dac2e69d19c184df9e6bb134eb0/bourbaki.regex-0.1.6.tar.gz" } ], "0.1.7": [ { "comment_text": "", "digests": { "md5": "636aaf26afe4f1bf30649296efee17c8", "sha256": "ad81619520d4329e8d7102a595a1d38fc635a41229f6f38f44b7e22eb96d4a6d" }, "downloads": -1, "filename": "bourbaki.regex-0.1.7-py3-none-any.whl", "has_sig": false, "md5_digest": "636aaf26afe4f1bf30649296efee17c8", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 18318, "upload_time": "2019-08-08T23:03:11", "url": "https://files.pythonhosted.org/packages/c6/b5/274ef55cb3b56e3daa555ad6bc8875151432b97a7551d714f98537274119/bourbaki.regex-0.1.7-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "fa25f21cf2613ba9a65b098f4f7bf2a8", "sha256": 
"c544f23cf7b448629e754c00a5bed87ef39270927aed54694bb5886addea4cac" }, "downloads": -1, "filename": "bourbaki.regex-0.1.7.tar.gz", "has_sig": false, "md5_digest": "fa25f21cf2613ba9a65b098f4f7bf2a8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 21468, "upload_time": "2019-08-08T23:03:12", "url": "https://files.pythonhosted.org/packages/e6/d1/8a1e5fb80592ebf8c11ec2c28826bb6fc644727dd4f2890b907bf1439dac/bourbaki.regex-0.1.7.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "636aaf26afe4f1bf30649296efee17c8", "sha256": "ad81619520d4329e8d7102a595a1d38fc635a41229f6f38f44b7e22eb96d4a6d" }, "downloads": -1, "filename": "bourbaki.regex-0.1.7-py3-none-any.whl", "has_sig": false, "md5_digest": "636aaf26afe4f1bf30649296efee17c8", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 18318, "upload_time": "2019-08-08T23:03:11", "url": "https://files.pythonhosted.org/packages/c6/b5/274ef55cb3b56e3daa555ad6bc8875151432b97a7551d714f98537274119/bourbaki.regex-0.1.7-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "fa25f21cf2613ba9a65b098f4f7bf2a8", "sha256": "c544f23cf7b448629e754c00a5bed87ef39270927aed54694bb5886addea4cac" }, "downloads": -1, "filename": "bourbaki.regex-0.1.7.tar.gz", "has_sig": false, "md5_digest": "fa25f21cf2613ba9a65b098f4f7bf2a8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 21468, "upload_time": "2019-08-08T23:03:12", "url": "https://files.pythonhosted.org/packages/e6/d1/8a1e5fb80592ebf8c11ec2c28826bb6fc644727dd4f2890b907bf1439dac/bourbaki.regex-0.1.7.tar.gz" } ] }