{ "info": { "author": "Dag Odenhall", "author_email": "dag.odenhall@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: OSI Approved :: BSD License", "Programming Language :: Java", "Programming Language :: Python", "Programming Language :: Python :: 2.6", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3.1", "Topic :: Scientific/Engineering :: Artificial Intelligence", "Topic :: Scientific/Engineering :: Human Machine Interfaces", "Topic :: Text Processing :: Linguistic" ], "description": "Python interface to camxes\n==========================\n\nTo install, you need a Java runtime environment as a ``java`` command on\nyour ``$PATH``, Python 2.6+ (including 3.1) and python-setuptools (or\ndistribute). Then you can simply install this package from PyPI with\n``easy_install`` or ``pip``, or as a dependency in your own ``setup.py``.\nThe parser itself is bundled with this package so you don't need to worry\nabout that.\n\n::\n\n easy_install camxes\n\n\nParsing Lojban\n--------------\n\nThe ``parse()`` function returns a parse tree of named nodes.\n\n>>> import camxes\n>>> print camxes.parse(\"coi rodo\")\ntext\n `- free\n +- CMAVO\n | `- COI\n | `- u'coi'\n `- sumti5\n +- CMAVO\n | `- PA\n | `- u'ro'\n `- CMAVO\n `- KOhA\n `- u'do'\n\nTurn a tree back into Lojban with the ``lojban`` property.\n\n>>> camxes.parse(\"coi rodo!\").lojban\nu'coi ro do'\n\nThis joins the leaf nodes with a space, but you can preserve spaces and\npunctuation by passing ``spaces=True`` to ``parse()``.\n\n>>> camxes.parse(\"coi rodo!\", spaces=True).lojban\nu'coi rodo!'\n\nChild nodes can be accessed by name as attributes, giving a list of such\nnodes. If there are no child nodes with that name an exception is raised.\n\n>>> print camxes.parse(\"coi rodo\").free[0].sumti5[0].CMAVO[1]\nCMAVO\n `- KOhA\n `- u'do'\n\nYou can also access nodes by sequential position without giving the name.\n\n>>> print camxes.parse(\"coi rodo\")[0][1]\nsumti5\n +- CMAVO\n | `- PA\n | `- u'ro'\n `- CMAVO\n `- KOhA\n `- u'do'\n\nNodes iterate over their children.\n\n>>> list(camxes.parse(\"coi rodo\")[0][1])\n[, ]\n\nThey also know their name.\n\n>>> camxes.parse(\"coi rodo\")[0][1].name\nu'sumti5'\n\n\nVerifying grammatical validity\n------------------------------\n\n``parse()`` is able to parse some ungrammatical input by processing as much\nas is grammatical. It is therefore unreliable for checking if some text is\ngrammatical. For this purpose, there is the ``isgrammatical()`` predicate.\n\n>>> camxes.isgrammatical(\"coi rodo\")\nTrue\n>>> camxes.isgrammatical(\"mupli cu fliba\")\nFalse\n>>> print camxes.parse(\"mupli cu fliba\")\ntext\n `- BRIVLA\n `- gismu\n `- u'mupli'\n\n\nDeconstructing compound words into affixes\n------------------------------------------\n\n``decompose()`` gives you the affixes and hyphens of a compound.\n\n>>> camxes.decompose(\"genturfa'i\")\n(u'gen', u'tur', u\"fa'i\")\n\nIt will complain for input that is not a single, valid compound.\n\n>>> camxes.decompose(\"camxes\")\nTraceback (most recent call last):\n File \"\", line 1, in \nValueError: invalid compound 'camxes'\n\n\nParsing only morphology\n-----------------------\n\nThe ``morphology()`` function works much like ``parse()``.\n\n>>> print camxes.morphology(\"coi\")\ntext\n `- CMAVO\n `- COI\n +- c\n | `- u'c'\n +- o\n | `- u'o'\n `- i\n `- u'i'\n\n\nTree traversal\n--------------\n\nSearch for nodes with the ``find()`` method. It takes any number of arguments\nthat are wildcard-matched against node names. This operation recurses down\neach branch until a match is found, but does not search children of\nmatching nodes.\n\n>>> camxes.parse(\"coi rodo\").find('sumti*')\n(,)\n\n>>> camxes.parse(\"coi rodo\").find('PA', 'KOhA')\n(, )\n\nKey access on nodes is a shortcut for the first match of a find.\n\n>>> camxes.parse(\"la camxes genturfa'i fi la lojban\")['cmene']\n\n\nThe ``leafs`` property is a tuple of all leaf nodes, which should be the\nunicode lexemes.\n\n>>> camxes.parse(\"coi rodo\").leafs\n(u'coi', u'ro', u'do')\n\nThe ``branches()`` method finds the parents of nodes whose leafs match the\narguments. This lets you search for the branches a sequence of lexemes\nbelong to.\n\n>>> camxes.parse(\"lo ninmu cu klama lo tcadu\").branches(\"lo\")\n(, )\n>>> camxes.parse(\"lo ninmu cu klama lo tcadu\").branches(\"ninmu\")\n(,)\n>>> camxes.parse(\"lo ninmu cu klama lo tcadu\").branches(\"klama\", \"lo\", \"tcadu\")\n(,)\n\nA generalization of these is called ``filter()`` and takes a predicate\nfunction that decides if a node should be listed. ``filter()`` is a\ngenerator so we use ``list()`` here to see the results.\n\n>>> leafparent = lambda node: not isinstance(node[0], camxes.Node)\n>>> list(camxes.parse(\"coi rodo\").filter(leafparent))\n[, , ]\n\n\nTree transformation\n-------------------\n\nYou can transform a node, recursively, into a tuple of strings, where the\nfirst item is the name of the node and the rest are the child nodes. This\nproperty is called ``primitive`` and can be useful if you're serializing a\nparse tree to a more \u201cdumb\u201d format such as JSON.\n\n>>> from pprint import pprint\n>>> pprint(camxes.parse(\"coi rodo\").primitive)\n(u'text',\n (u'free',\n (u'CMAVO', (u'COI', u'coi')),\n (u'sumti5', (u'CMAVO', (u'PA', u'ro')), (u'CMAVO', (u'KOhA', u'do')))))\n\n>>> import json\n>>> print json.dumps(camxes.parse(\"coi\").primitive, indent=2)\n[\n \"text\", \n [\n \"CMAVO\", \n [\n \"COI\", \n \"coi\"\n ]\n ]\n]\n\nThe generalization of ``primitive`` is called ``map()`` and takes a\ntransformer function that in turn takes a node. The transformation is then\nmapped recursively on all nodes and a nested tuple, similar to that of\n``primitive``, is returned.\n\n>>> camxes.parse(\"coi rodo\").map(len)\n(1, (2, (1, (1, 3)), (2, (1, (1, 2)), (1, (1, 2)))))", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/dag/python-camxes", "keywords": null, "license": "Simplified BSD", "maintainer": null, "maintainer_email": null, "name": "camxes", "package_url": "https://pypi.org/project/camxes/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/camxes/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/dag/python-camxes" }, "release_url": "https://pypi.org/project/camxes/0.2/", "requires_dist": null, "requires_python": null, "summary": "Python interface to camxes.", "version": "0.2" }, "last_serial": 787223, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "e84bc25c6a4dceb36999b286c4a35596", "sha256": "f2e54244b68d23bd9b95253183b3fbfb038dadedfa9e89ff95eb76073b9891f3" }, "downloads": -1, "filename": "camxes-0.1.tar.gz", "has_sig": false, "md5_digest": "e84bc25c6a4dceb36999b286c4a35596", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 823176, "upload_time": "2011-03-28T22:41:47", "url": "https://files.pythonhosted.org/packages/31/db/5ad467cdff2192a6ebedbb13c38199be258e81203bb7847e1473805f248b/camxes-0.1.tar.gz" } ], "0.2": [ { "comment_text": "", "digests": { "md5": "15173336e9e6048fe72aec540c7744db", "sha256": "f056027fdb3f8aa942dd49ada0b33a2ea6bf285c16ae7e7d7c073f60142012e6" }, "downloads": -1, "filename": "camxes-0.2.tar.gz", "has_sig": false, "md5_digest": "15173336e9e6048fe72aec540c7744db", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 826220, "upload_time": "2011-03-31T23:54:41", "url": "https://files.pythonhosted.org/packages/6a/6c/0cfdb2632b82146ac2c44bf4ba11110109e2a8a3d75f8152ca51205fd070/camxes-0.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "15173336e9e6048fe72aec540c7744db", "sha256": "f056027fdb3f8aa942dd49ada0b33a2ea6bf285c16ae7e7d7c073f60142012e6" }, "downloads": -1, "filename": "camxes-0.2.tar.gz", "has_sig": false, "md5_digest": "15173336e9e6048fe72aec540c7744db", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 826220, "upload_time": "2011-03-31T23:54:41", "url": "https://files.pythonhosted.org/packages/6a/6c/0cfdb2632b82146ac2c44bf4ba11110109e2a8a3d75f8152ca51205fd070/camxes-0.2.tar.gz" } ] }