{ "info": { "author": "Cameron Simpson", "author_email": "cs@cskk.id.au", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 2", "Programming Language :: Python :: 3", "Topic :: Software Development :: Libraries :: Python Modules" ], "description": "Release 20190812:\nFix bad slosh escapes in strings.\n\nLexical analysis functions, tokenisers.\n\nAn arbitrary assortment of lexical and tokenisation functions useful\nfor writing recursive descent parsers, of which I have several.\n\nGenerally the get_* functions accept a source string and an offset\n(usually optional, default 0) and return a token and the new offset,\nraising ValueError on failed tokenisation.\n\n## Function `as_lines(chunks, partials=None)`\n\nGenerator yielding complete lines from arbitrary pieces of text from\nthe iterable `chunks`.\n\nAfter completion, any remaining newline-free chunks remain\nin the partials list; this will be unavailable to the caller\nunless the list is presupplied.\n\n## Function `get_chars(s, offset, gochars)`\n\nScan the string `s` for characters in `gochars` starting at `offset`.\nReturn (match, new_offset).\n\n## Function `get_decimal(s, offset=0)`\n\nScan the string `s` for decimal characters starting at `offset`.\nReturn (dec_string, new_offset).\n\n## Function `get_decimal_or_float_value(s, offset=0)`\n\nFetch a decimal or basic float (nnn.nnn) value\nfrom the str `s` at `offset`.\nReturn (value, new_offset).\n\n## Function `get_decimal_value(s, offset=0)`\n\nScan the string `s` for a decimal value starting at `offset`.\nReturn (value, new_offset).\n\n## Function `get_delimited(s, offset, delim)`\n\nCollect text from the string `s` from position `offset` up\nto the first occurrence of delimiter `delim`; return the text\nexcluding 
the delimiter and the offset after the delimiter.\n\n## Function `get_dotted_identifier(s, offset=0, **kw)`\n\nScan the string `s` for a dotted identifier (by default an\nASCII letter or underscore followed by letters, digits or\nunderscores) with optional trailing dot and another dotted\nidentifier, starting at `offset` (default 0).\nReturn (match, new_offset).\n\nNote: the empty string and an unchanged offset will be returned if\nthere is no leading letter/underscore.\n\n## Function `get_envvar(s, offset=0, environ=None, default=None, specials=None)`\n\nParse a simple environment variable reference to $varname or\n$x where \"x\" is a special character.\n\nParameters:\n* `s`: the string with the variable reference\n* `offset`: the starting point for the reference\n* `default`: default value for missing environment variables;\n if None (the default) a ValueError is raised\n* `environ`: the environment mapping, default os.environ\n* `specials`: the mapping of special single character variables\n\n## Function `get_hexadecimal(s, offset=0)`\n\nScan the string `s` for hexadecimal characters starting at `offset`.\nReturn hex_string, new_offset.\n\n## Function `get_hexadecimal_value(s, offset=0)`\n\nScan the string `s` for a hexadecimal value starting at `offset`.\nReturn (value, new_offset).\n\n## Function `get_identifier(s, offset=0, alpha='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', number='0123456789', extras='_')`\n\nScan the string `s` for an identifier (by default an ASCII\nletter or underscore followed by letters, digits or underscores)\nstarting at `offset` (default 0).\nReturn (match, new_offset).\n\nNote: the empty string and an unchanged offset will be returned if\nthere is no leading letter/underscore.\n\nParameters:\n* `s`: the string to scan\n* `offset`: the starting offset, default 0.\n* `alpha`: the characters considered alphabetic,\n default `string.ascii_letters`.\n* `number`: the characters considered numeric,\n default `string.digits`.\n* 
`extras`: extra characters considered part of an identifier,\n default `'_'`.\n\n## Function `get_nonwhite(s, offset=0)`\n\nScan the string `s` for characters not in string.whitespace\nstarting at `offset` (default 0).\nReturn (match, new_offset).\n\n## Function `get_other_chars(s, offset=0, stopchars=None)`\n\nScan the string `s` for characters not in `stopchars` starting\nat `offset` (default 0).\nReturn (match, new_offset).\n\n## Function `get_qstr(s, offset=0, q='\"', environ=None, default=None, env_specials=None)`\n\nGet quoted text with slosh escapes and optional environment substitution.\n\nParameters:\n* `s`: the string containing the quoted text.\n* `offset`: the starting point, default 0.\n* `q`: the quote character, default `'\"'`. If `q` is set to `None`,\n do not expect the string to be delimited by quote marks.\n* `environ`: if not `None`, also parse and expand $envvar references.\n* `default`: passed to `get_envvar`\n\n## Function `get_qstr_or_identifier(s, offset)`\n\nParse a double quoted string or an identifier.\n\n## Function `get_sloshed_text(s, delim, offset=0, slosh='\\\\', mapper=slosh_mapper, specials=None)`\n\nCollect slosh escaped text from the string `s` from position\n`offset` (default 0) and return the decoded unicode string and\nthe offset of the completed parse.\n\nParameters:\n* `delim`: end of string delimiter, such as a single or double quote.\n* `offset`: starting offset within `s`, default 0.\n* `slosh`: escape character, default a slosh ('\\').\n* `mapper`: a mapping function which accepts a single character\n and returns a replacement string or `None`; this is used to\n replace things such as '\\t' or '\\n'. The default is the\n `slosh_mapper` function, whose default mapping is `SLOSH_CHARMAP`.\n* `specials`: a mapping of other special character sequences and parse\n functions for gathering them up. 
When one of the special\n character sequences is found in the string, the parse\n function is called to parse at that point.\n The parse functions accept\n `s` and the offset of the special character. They return\n the decoded string and the offset past the parse.\n\nThe escape character `slosh` introduces an encoding of some\nreplacement text whose value depends on the following character.\nIf the following character is:\n* the escape character `slosh`, insert the escape character.\n* the string delimiter `delim`, insert the delimiter.\n* the character 'x', insert the character with code from the following\n 2 hexadecimal digits.\n* the character 'u', insert the character with code from the following\n 4 hexadecimal digits.\n* the character 'U', insert the character with code from the following\n 8 hexadecimal digits.\n* a character from the keys of `mapper`\n\n## Function `get_tokens(s, offset, getters)`\n\nParse the string `s` from position `offset` using the supplied\ntokenise functions `getters`; return the list of tokens matched\nand the final offset.\n\nParameters:\n* `s`: the string to parse.\n* `offset`: the starting position for the parse.\n* `getters`: an iterable of tokeniser specifications.\n\nEach tokeniser specification is either:\n* a callable expecting (s, offset) and returning (token, new_offset)\n* a literal string, to be matched exactly\n* a tuple or list with values (func, args, kwargs);\n call func(s, offset, *args, **kwargs)\n* an object with a .match method such as a regex;\n call getter.match(s, offset) and return a match object with\n a .end() method returning the offset of the end of the match\n\n## Function `get_uc_identifier(s, offset=0, number='0123456789', extras='_')`\n\nScan the string `s` for an identifier as for get_identifier(),\nbut require the letters to be uppercase.\n\n## Function `get_white(s, offset=0)`\n\nScan the string `s` for characters in string.whitespace\nstarting at `offset` (default 0).\nReturn (match, 
new_offset).\n\n## Function `hexify(bs)`\n\nA Python 2 flavour of binascii.hexlify.\n\n## Function `htmlify(s, nbsp=False)`\n\nConvert a string for safe transcription in HTML.\n\nParameters:\n* `s`: the string\n* `nbsp`: replaces spaces with `\"&nbsp;\"` to prevent word folding,\n default `False`.\n\n## Function `htmlquote(s)`\n\nQuote a string for use in HTML.\n\n## Function `is_dotted_identifier(s, offset=0, **kw)`\n\nTest if the string `s` is a dotted identifier from position `offset` onward.\n\n## Function `is_identifier(s, offset=0, **kw)`\n\nTest if the string `s` is an identifier from position `offset` onward.\n\n## Function `isUC_(s)`\n\nCheck that a string matches `^[A-Z][A-Z_0-9]*$`.\n\n## Function `jsquote(s)`\n\nQuote a string for use in JavaScript.\n\n## Function `lastlinelen(s)`\n\nThe length of text after the last newline in a string.\n\n(Initially used by cs.hier to compute effective text width.)\n\n## Function `match_tokens(s, offset, getters)`\n\nWrapper for get_tokens which catches ValueError exceptions\nand returns (None, offset).\n\n## Function `parseUC_sAttr(attr)`\n\nTake an attribute name and return `(key, is_plural)`.\n\n`'FOO'` returns `('FOO', False)`.\n`'FOOs'` or `'FOOes'` returns `('FOO', True)`.\nOtherwise return `(None, False)`.\n\n## Function `phpquote(s)`\n\nQuote a string for use in PHP code.\n\n## Function `skipwhite(s, offset=0)`\n\nConvenience routine for skipping past whitespace;\nreturns the offset of the next nonwhitespace character.\n\n## Function `slosh_mapper(c, charmap=None)`\n\nReturn a string to replace backslash-`c`, or None.\n\n## Function `stripped_dedent(s)`\n\nSlightly smarter dedent which ignores a string's opening indent.\n\nStrip the supplied string `s`. Pull off the leading line.\nDedent the rest. Put back the leading line.\n\nExample:\n\n >>> def func(s):\n ... \"\"\" Slightly smarter dedent which ignores a string's opening indent.\n ... Strip the supplied string `s`. Pull off the leading line.\n ... Dedent the rest. 
Put back the leading line.\n ... \"\"\"\n ... pass\n ...\n >>> from cs.lex import stripped_dedent\n >>> print(stripped_dedent(func.__doc__))\n Slightly smarter dedent which ignores a string's opening indent.\n Strip the supplied string `s`. Pull off the leading line.\n Dedent the rest. Put back the leading line.\n\n## Function `strlist(ary, sep=', ')`\n\nConvert an iterable to strings and join with `sep`, default \", \".\n\n## Function `tabpadding(padlen, tabsize=8, offset=0)`\n\nCompute some spaces to use as tab padding at an offset.\n\n## Function `texthexify(bs, shiftin='[', shiftout=']', whitelist=None)`\n\nTranscribe the bytes `bs` to text using compact text runs for\nsome common text values.\n\nThis can be reversed with the `untexthexify` function.\n\nThis is an ad hoc format devised to be compact but also to\nexpose \"text\" embedded within to the eye. The original use\ncase was transcribing a binary directory entry format, where\nthe filename parts would be somewhat visible in the transcription.\n\nThe output is a string of hexadecimal digits for the encoded\nbytes except for runs of values from the whitelist, which are\nenclosed in the shiftin and shiftout markers and transcribed\nas is. The default whitelist is values of the ASCII letters,\nthe decimal digits and the punctuation characters '_-+.,'.\nThe default shiftin and shiftout markers are '[' and ']'.\n\nExample:\n\n >>> texthexify(b'&^%&^%abcdefghi)(*)(*')\n '265e25265e25[abcdefghi]29282a29282a'\n\nParameters:\n* `bs`: the bytes to transcribe\n* `shiftin`: Optional. The marker string used to indicate a shift to\n direct textual transcription of the bytes, default: `'['`.\n* `shiftout`: Optional. 
The marker string used to indicate a\n shift from text mode back into hexadecimal transcription,\n default `']'`.\n* `whitelist`: an optional bytes or string object indicating byte\n values which may be represented directly in text; string objects are\n converted to hexify() and texthexify() output strings may be freely\n concatenated and decoded with untexthexify().\n The default value is the ASCII letters, the decimal digits\n and the punctuation characters '_-+.,'.\n\n## Function `unctrl(s, tabsize=8)`\n\nReturn the string `s` with TABs expanded and control characters\nreplaced with printable representations.\n\n## Function `untexthexify(s, shiftin='[', shiftout=']')`\n\nDecode a textual representation of binary data into binary data.\n\nThis is the reverse of the `texthexify` function.\n\nOutside of the `shiftin`/`shiftout` markers the binary data\nare represented as hexadecimal. Within the markers the bytes\nhave the values of the ordinals of the characters.\n\nExample:\n\n >>> untexthexify('265e25265e25[abcdefghi]29282a29282a')\n b'&^%&^%abcdefghi)(*)(*'\n\nParameters:\n* `s`: the string containing the text representation.\n* `shiftin`: Optional. The marker string commencing a sequence\n of direct text transcription, default `'['`.\n* `shiftout`: Optional. The marker string ending a sequence\n of direct text transcription, default `']'`.\n\n\n\n# Release Log\n\nRelease 20190812:\nFix bad slosh escapes in strings.\n\nRelease 20190220:\nNew function get_qstr_or_identifier.\n\nRelease 20181108:\nnew function get_decimal_or_float_value to read a decimal or basic float\n\nRelease 20180815:\nNo semantic changes; update some docstrings and clean some lint, fix a unit test.\n\nRelease 20180810:\nNew get_decimal_value and get_hexadecimal_value functions.\nNew stripped_dedent function, a slightly smarter textwrap.dedent.\n\nRelease 20171231:\nNew function get_decimal. 
Drop unused function dict2js.\n\nRelease 20170904:\nPython 2/3 ports, move rfc2047 into new cs.rfc2047 module.\n\nRelease 20160828:\nUse \"install_requires\" instead of \"requires\" in DISTINFO.\nDiscard str1(), pointless optimisation.\nunrfc2047: map _ to SPACE, improve exception handling.\nAdd phpquote: quote a string for use in PHP code; add docstring to jsquote.\nAdd is_identifier test.\nAdd get_dotted_identifier.\nAdd is_dotted_identifier.\nAdd get_hexadecimal.\nAdd skipwhite, convenience wrapper for get_white returning just the next offset.\nAssorted bugfixes and improvements.\n\nRelease 20150120:\ncs.lex: texthexify: backport to python 2 using cs.py3 bytes type\n\nRelease 20150118:\nmetadata updates\n\nRelease 20150116:\nPyPI metadata and slight code cleanup.", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://bitbucket.org/cameron_simpson/css/commits/all", "keywords": "python2,python3", "license": "GNU General Public License v3 or later (GPLv3+)", "maintainer": "", "maintainer_email": "", "name": "cs.lex", "package_url": "https://pypi.org/project/cs.lex/", "platform": "", "project_url": "https://pypi.org/project/cs.lex/", "project_urls": { "Homepage": "https://bitbucket.org/cameron_simpson/css/commits/all" }, "release_url": "https://pypi.org/project/cs.lex/20190812/", "requires_dist": null, "requires_python": "", "summary": "Lexical analysis functions, tokenisers.", "version": "20190812" }, "last_serial": 5663484, "releases": { "20150118": [ { "comment_text": "", "digests": { "md5": "532ce38899726a27d8c0910f751e5e08", "sha256": "fbd1f673a03a36361e0b126241b9e12c18e79d844b6a64ae248d2b2986932500" }, "downloads": -1, "filename": "cs.lex-20150118.tar.gz", "has_sig": false, "md5_digest": "532ce38899726a27d8c0910f751e5e08", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8005, "upload_time": 
"2015-01-18T05:58:10", "url": "https://files.pythonhosted.org/packages/8d/3f/519781144aa6b9b603696365283c7c240458f4e3286013cf9bcc6c00ef00/cs.lex-20150118.tar.gz" } ], "20150120": [ { "comment_text": "", "digests": { "md5": "b8ed5aede3eb8b527acd72417413ef31", "sha256": "84efcc0725cf99fe953148c49c7d0e8c7e283c244f0ea039f6decfac54c98235" }, "downloads": -1, "filename": "cs.lex-20150120.tar.gz", "has_sig": false, "md5_digest": "b8ed5aede3eb8b527acd72417413ef31", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8142, "upload_time": "2015-01-19T23:36:47", "url": "https://files.pythonhosted.org/packages/b2/5f/83512d8dcd5e5d92bca9a048f2749863fa0f3f2964b4e0c18de761b72f47/cs.lex-20150120.tar.gz" } ], "20160828": [ { "comment_text": "", "digests": { "md5": "b22839b534cc0deb9d98dd9cef5b2166", "sha256": "b93062cb06598d82335df57268fc27ca49420f07f87c04a3a58019170d6d5a48" }, "downloads": -1, "filename": "cs.lex-20160828.tar.gz", "has_sig": false, "md5_digest": "b22839b534cc0deb9d98dd9cef5b2166", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8951, "upload_time": "2016-08-28T06:07:18", "url": "https://files.pythonhosted.org/packages/b7/a5/1ddfcf4304d3f40897bce3701053b8521b85d82dc4dfe815321672f184ef/cs.lex-20160828.tar.gz" } ], "20170904": [ { "comment_text": "", "digests": { "md5": "4d318811bdb1f9a761491a75d029b547", "sha256": "3987b53546cfbab46c188bf8b83fa70b4fa311e1fd8256714e636abacd2254a2" }, "downloads": -1, "filename": "cs.lex-20170904.tar.gz", "has_sig": false, "md5_digest": "4d318811bdb1f9a761491a75d029b547", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8278, "upload_time": "2017-09-04T01:07:12", "url": "https://files.pythonhosted.org/packages/c5/7c/a952ec94ddaf052b22f4fb2e274c60716cfe6e3727c966f5ee0affc37f1c/cs.lex-20170904.tar.gz" } ], "20171231": [ { "comment_text": "", "digests": { "md5": "8f4e597627284b45f8ce8fbb3d6c9201", "sha256": 
"1d8a58aa5f6d1e799d63264554f3bb5831149957d42ae9dc58c586dcf557a0f3" }, "downloads": -1, "filename": "cs.lex-20171231.tar.gz", "has_sig": false, "md5_digest": "8f4e597627284b45f8ce8fbb3d6c9201", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8330, "upload_time": "2017-12-30T22:35:53", "url": "https://files.pythonhosted.org/packages/20/2f/934c98939143a6e6d0068366ea3940508640e0253616c1a107c05aeff372/cs.lex-20171231.tar.gz" } ], "20180810": [ { "comment_text": "", "digests": { "md5": "2011039a6649e7c2d22fdf48bba793ea", "sha256": "5feb4b40ec82987c02b6d4543885f5c0d9d26cb5e37e9327879c8c152c0386cb" }, "downloads": -1, "filename": "cs.lex-20180810.tar.gz", "has_sig": false, "md5_digest": "2011039a6649e7c2d22fdf48bba793ea", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9679, "upload_time": "2018-08-10T01:53:27", "url": "https://files.pythonhosted.org/packages/a6/d0/b46b824f86b6e962807711b9afb4fe37dfc9c317d8eb6d3648ff82de89a4/cs.lex-20180810.tar.gz" } ], "20180815": [ { "comment_text": "", "digests": { "md5": "4d86ff4feda881ba78be5cdecd47828d", "sha256": "5c9e12dc09218525e0a9ade114095194b61f3e5b063a120f7b05b8d51818955c" }, "downloads": -1, "filename": "cs.lex-20180815.tar.gz", "has_sig": false, "md5_digest": "4d86ff4feda881ba78be5cdecd47828d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12030, "upload_time": "2018-08-14T22:52:33", "url": "https://files.pythonhosted.org/packages/8d/dc/f72474b72057d77fc496ceac38af10f06f8cb2302d7662a871b541ed0e40/cs.lex-20180815.tar.gz" } ], "20181108": [ { "comment_text": "", "digests": { "md5": "761c5f7e4d207a2e504eb5b44bda5bb5", "sha256": "d9be4e36e2dad5ed15781e8e731869bc1b479b36cec89aea7efc677efb12b095" }, "downloads": -1, "filename": "cs.lex-20181108.tar.gz", "has_sig": false, "md5_digest": "761c5f7e4d207a2e504eb5b44bda5bb5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12207, 
"upload_time": "2018-11-07T22:03:08", "url": "https://files.pythonhosted.org/packages/a0/ab/e4808898e4027fcb3e7850164133a66fad3c11322e70ed08bd8e741231ea/cs.lex-20181108.tar.gz" } ], "20190220": [ { "comment_text": "", "digests": { "md5": "7f00d2dd1eebeac175345b3af6eac019", "sha256": "4f0d8c4d4f6f24c5aa68f2bd99c61990f7dbd05b66067acaaacaef97c54fe9cd" }, "downloads": -1, "filename": "cs.lex-20190220.tar.gz", "has_sig": false, "md5_digest": "7f00d2dd1eebeac175345b3af6eac019", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12271, "upload_time": "2019-02-20T09:19:47", "url": "https://files.pythonhosted.org/packages/5a/4d/e9a6bbfb07d39c57eb1cc37568572aab322def4b6c3e2669f462ed80ae36/cs.lex-20190220.tar.gz" } ], "20190812": [ { "comment_text": "", "digests": { "md5": "1f23977a7c4bab25f775d4b14802f9c2", "sha256": "d37baf562a84fb3e65ac1d81b9701ce8fddc12b1aabc8d69b012859ea8d81811" }, "downloads": -1, "filename": "cs.lex-20190812.tar.gz", "has_sig": false, "md5_digest": "1f23977a7c4bab25f775d4b14802f9c2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16395, "upload_time": "2019-08-11T23:36:04", "url": "https://files.pythonhosted.org/packages/bd/1d/1b62da36c3dc21a2f6d3ddf4f6622e0e23c76a35a47cccfbc51657f4b486/cs.lex-20190812.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "1f23977a7c4bab25f775d4b14802f9c2", "sha256": "d37baf562a84fb3e65ac1d81b9701ce8fddc12b1aabc8d69b012859ea8d81811" }, "downloads": -1, "filename": "cs.lex-20190812.tar.gz", "has_sig": false, "md5_digest": "1f23977a7c4bab25f775d4b14802f9c2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 16395, "upload_time": "2019-08-11T23:36:04", "url": "https://files.pythonhosted.org/packages/bd/1d/1b62da36c3dc21a2f6d3ddf4f6622e0e23c76a35a47cccfbc51657f4b486/cs.lex-20190812.tar.gz" } ] }