{ "info": { "author": "Alex Perry", "author_email": "alex.perry@google.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", "License :: OSI Approved :: Apache Software License", "Operating System :: OS Independent" ], "description": "=========\nsre_yield\n=========\n\nQuick Start\n===========\n\nThe goal of ``sre_yield`` is to efficiently generate all values that can match a\ngiven regular expression, or count possible matches efficiently. It uses the\nparsed regular expression, so you get a much more accurate result than trying\nto just split strings.\n\n.. code-block:: pycon\n\n >>> s = 'foo|ba[rz]'\n >>> s.split('|') # bad\n ['foo', 'ba[rz]']\n\n >>> import sre_yield\n >>> list(sre_yield.AllStrings(s)) # better\n ['foo', 'bar', 'baz']\n\nIt does this by walking the tree as constructed by ``sre_parse`` (same thing\nused internally by the ``re`` module), and constructing chained/repeating\niterators as appropriate. There may be duplicate results, depending on your\ninput string though -- these are cases that ``sre_parse`` did not optimize.\n\n.. code-block:: pycon\n\n >>> import sre_yield\n >>> list(sre_yield.AllStrings('.|a', charset='ab'))\n ['a', 'b', 'a']\n\n...and happens in simpler cases too:\n\n.. code-block:: pycon\n\n >>> list(sre_yield.AllStrings('a|a'))\n ['a', 'a']\n\n\nQuirks\n======\n\nThe membership check, ``'abc' in values_obj`` is by necessity fullmatch -- it\nmust cover the entire string. Imagine that it has ``^(...)$`` around it.\nBecause ``re.search`` can match anywhere in an arbitrarily string, emulating\nthis would produce a large number of junk matches -- probably not what you\nwant. (If that is what you want, add a ``.*`` on either side.)\n\nHere's a quick example, using the presidents regex from http://xkcd.com/1313/\n\n.. code-block:: pycon\n\n >>> s = 'bu|[rn]t|[coy]e|[mtg]a|j|iso|n[hl]|[ae]d|lev|sh|[lnd]i|[po]o|ls'\n\n >>> import re\n >>> re.search(s, 'kennedy') is not None # note .search\n True\n >>> v = sre_yield.AllStrings(s)\n >>> v.__len__()\n 23\n >>> 'bu' in v\n True\n >>> v[:5]\n ['bu', 'rt', 'nt', 'ce', 'oe']\n\nIf you do want to emulate search, you end up with a large number of matches\nquickly. Limiting the repetition a bit helps, but it's still a very large\nnumber.\n\n.. code-block:: pycon\n\n >>> v2 = sre_yield.AllStrings('.{,30}(' + s + ').{,30}')\n >>> el = v2.__len__() # too big for int\n >>> print(str(el).rstrip('L'))\n 57220492262913872576843611006974799576789176661653180757625052079917448874638816841926032487457234703154759402702651149752815320219511292208238103\n >>> 'kennedy' in v2\n True\n\n\nCapturing Groups\n================\n\nIf you're interested in extracting what would match during generation of a\nvalue, you can use AllMatches instead to get Match objects.\n\n.. code-block:: pycon\n\n >>> v = sre_yield.AllMatches(r'a(\\d)b')\n >>> m = v[0]\n >>> m.group(0)\n 'a0b'\n >>> m.group(1)\n '0'\n\nThis even works for simplistic backreferences, in this case to have matching quotes.\n\n.. code-block:: pycon\n\n >>> v = sre_yield.AllMatches(r'([\"\\'])([01]{3})\\1')\n >>> m = v[0]\n >>> m.group(0)\n '\"000\"'\n >>> m.groups()\n ('\"', '000')\n >>> m.group(1)\n '\"'\n >>> m.group(2)\n '000'\n\n\nReporting Bugs, etc.\n====================\n\nWe welcome bug reports -- see our issue tracker on `GitHub\n`_ to see if it's been reported before.\nIf you'd like to discuss anything, we have a `Google Group\n`_ as well.\n\n\nRelated Modules\n===============\n\nWe're aware of three similar modules, but each has a different goal.\n\n\nxeger\n-----\n\nXeger was originally written `in Java `_ and\nported `to Python `_. This\ngenerates random entries, which may suffice if you want to get just a few\nmatching values. This module and ``xeger`` differ statistically in the way\nthey handle repetitions:\n\n.. code-block:: pycon\n\n >>> import random\n >>> v = sre_yield.AllStrings('[abc]{1,4}')\n >>> len(v)\n 120\n\n # Now random.choice(v) has a 3/120 chance of choosing a single letter.\n >>> len([x for x in v if len(x) == 1])\n 3\n\n # xeger(v) has ~25% chance of choosing a single letter, because the length\n and match are chosen independently.\n # (This doesn't run, so the doctests don't require xeger)\n > from rstr import xeger\n > sum([1 if len(xeger('[abc]{1,4}')) == 1 else 0 for _ in range(120)])\n 26\n\nIn addition, ``xeger`` differs in the default matching of ``'.'`` is for\nprintable characters (which you can get by setting ``charset=string.printable``\nin ``sre_yield``, if desired).\n\n\nsre_dump\n--------\n\nAnother module that walks ``sre_parse``'s tree is ``sre_dump``, although it\ndoes not generate matches, only reconstructs the string pattern (useful\nprimarily if you hand-generate a tree). If you're interested in the space,\nit's a good read. http://www.dalkescientific.com/Python/sre_dump.html\n\n\njpetkau1\n--------\n\nCan find matches by using randomization, so sort of handles anchors. Not\nguaranteed though, but another good look at internals.\nhttp://web.archive.org/web/20071024164712/http://www.uselesspython.com/jpetkau1.py\n(and slightly older version in the announcement on `python-list\n`_).\n\n\nDifferences between sre_yield and the re module\n===============================================\n\nThere are certainly valid regular expressions which ``sre_yield`` does not\nhandle. These include things like lookarounds, backreferences, but also a few\nother exceptions:\n\n- The maximum value for repeats is system-dependant -- CPython's ``sre`` module\n there's a special value which is treated as infinite (either 2**16-1 or\n 2**32-1 depending on build). In sre_yield, this is taken as a literal,\n rather than infinite, thus (on a 2**16-1 platform):\n\n .. code-block:: pycon\n\n >>> len(sre_yield.AllStrings('a*')[-1])\n 65535\n >>> import re\n >>> len(re.match('.*', 'a' * 100000).group(0))\n 100000\n\n- The ``re`` `module docs `_\n say \"Regular expression pattern strings may not contain null bytes\"\n yet this appears to work fine.\n- Order does not depend on greediness.\n- The regex is treated as fullmatch.\n- ``sre_yield`` is confused by complex uses of anchors, but support simple ones:\n\n .. code-block:: pycon\n\n >>> list(sre_yield.AllStrings('foo$'))\n ['foo']\n >>> list(sre_yield.AllStrings('^$'))\n ['']\n >>> list(sre_yield.AllStrings('.\\\\b.')) # doctest: +IGNORE_EXCEPTION_DETAIL\n Traceback (most recent call last):\n ...\n ParseError: Non-end-anchor None found at END state\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/google/sre_yield", "keywords": "", "license": "Apache", "maintainer": "", "maintainer_email": "", "name": "sre-yield", "package_url": "https://pypi.org/project/sre-yield/", "platform": "", "project_url": "https://pypi.org/project/sre-yield/", "project_urls": { "Homepage": "https://github.com/google/sre_yield" }, "release_url": "https://pypi.org/project/sre-yield/1.2/", "requires_dist": null, "requires_python": "", "summary": "Expands a regular expression to its possible matches", "version": "1.2" }, "last_serial": 5659617, "releases": { "1.0": [ { "comment_text": "", "digests": { "md5": "18adecb75734d1af94ddda79185dfcdb", "sha256": "0e5413ae8a56009d993883ded321746f7b8c828d56973d6cb1b67e344c37574d" }, "downloads": -1, "filename": "sre_yield-1.0.tar.gz", "has_sig": false, "md5_digest": "18adecb75734d1af94ddda79185dfcdb", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18072, "upload_time": "2014-04-14T13:01:21", "url": "https://files.pythonhosted.org/packages/49/ff/0bdd5a1f60e871444147cf4f015174b6418ec9232ea7918e052fe5008063/sre_yield-1.0.tar.gz" } ], "1.1": [ { "comment_text": "", "digests": { "md5": "9acf402c9119d27426807e8ae9c356be", "sha256": "4234406c647a8f71e8f58cf604470094d4999cb2e36a65aa4622dec666d1aa03" }, "downloads": -1, "filename": "sre_yield-1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "9acf402c9119d27426807e8ae9c356be", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 26570, "upload_time": "2018-11-16T04:48:20", "url": "https://files.pythonhosted.org/packages/ee/52/0d00d3eb5cf460e130e578854bbd45b426f6ab7dd310906be135e67daa58/sre_yield-1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "800bf53c723a568f6c8e14f95f41dce0", "sha256": "50366674be3ddbf8bec651737b4c4eba92da97cd3626c3d88f234673d986b913" }, "downloads": -1, "filename": "sre_yield-1.1.tar.gz", "has_sig": false, "md5_digest": "800bf53c723a568f6c8e14f95f41dce0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 19642, "upload_time": "2018-11-16T04:48:21", "url": "https://files.pythonhosted.org/packages/1f/79/ab3cfe84d7b699d3146296d79835f8b8a99c5dccf1eeb0e7f8a159494316/sre_yield-1.1.tar.gz" } ], "1.2": [ { "comment_text": "", "digests": { "md5": "1c7244252285550a6b5a84f86308c738", "sha256": "5747dc54435ede1890f7c046c8d4e0053cef41a973a2b78170222df0252e9efa" }, "downloads": -1, "filename": "sre_yield-1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "1c7244252285550a6b5a84f86308c738", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 27766, "upload_time": "2019-08-10T14:52:16", "url": "https://files.pythonhosted.org/packages/19/f2/dd65830662f61afc67df940446407766cc4d529de78b1e3ecc4c3c862a44/sre_yield-1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8627ef9bb9386014b800d0a83eb96e61", "sha256": "e94f1a2a3cbafffe1dcd15c1d54e401a1517e052aa64c7d3164f88dc761d7b8a" }, "downloads": -1, "filename": "sre_yield-1.2.tar.gz", "has_sig": false, "md5_digest": "8627ef9bb9386014b800d0a83eb96e61", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20724, "upload_time": "2019-08-10T14:52:18", "url": "https://files.pythonhosted.org/packages/f2/a6/4588fe0f6c6e3b03ddbfc2bd97227adfcf5ad0f49f79828529ab4d580eeb/sre_yield-1.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "1c7244252285550a6b5a84f86308c738", "sha256": "5747dc54435ede1890f7c046c8d4e0053cef41a973a2b78170222df0252e9efa" }, "downloads": -1, "filename": "sre_yield-1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "1c7244252285550a6b5a84f86308c738", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 27766, "upload_time": "2019-08-10T14:52:16", "url": "https://files.pythonhosted.org/packages/19/f2/dd65830662f61afc67df940446407766cc4d529de78b1e3ecc4c3c862a44/sre_yield-1.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8627ef9bb9386014b800d0a83eb96e61", "sha256": "e94f1a2a3cbafffe1dcd15c1d54e401a1517e052aa64c7d3164f88dc761d7b8a" }, "downloads": -1, "filename": "sre_yield-1.2.tar.gz", "has_sig": false, "md5_digest": "8627ef9bb9386014b800d0a83eb96e61", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20724, "upload_time": "2019-08-10T14:52:18", "url": "https://files.pythonhosted.org/packages/f2/a6/4588fe0f6c6e3b03ddbfc2bd97227adfcf5ad0f49f79828529ab4d580eeb/sre_yield-1.2.tar.gz" } ] }