{ "info": { "author": "Reid 'arrdem' McKenzie", "author_email": "me@arrdem.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7", "Topic :: Database", "Topic :: Database :: Database Engines/Servers", "Topic :: Database :: Front-Ends" ], "description": "# Datalog.Shell\n\nA shell for my Datalog engine.\n\n## What is Datalog?\n\n[Datalog](https://en.wikipedia.org/wiki/Datalog) is a fully\ndeclarative language for expressing relational data and queries,\ntypically written using a syntactic subset of Prolog. Its most\ninteresting feature compared to other relational languages such as SQL\nis that it features production rules.\n\nBriefly, a datalog database consists of rules and tuples. Tuples are\nwritten `a(b, \"c\", 126, ...).`, require no declaration eg. of table,\nmay be of arbitrary even varying length. The elements of this tuple\nare strings which may be written as bare words or quoted.\n\nIn the interpreter (or a file), we could define a small graph as such -\n\n```\n$ datalog\n>>> edge(a, b).\n\u21d2 edge('a', 'b')\n>>> edge(b, c).\n\u21d2 edge('b', 'c')\n>>> edge(c, d).\n\u21d2 edge('c', 'd')\n```\n\nBut how can we query this? We can issue queries by entering a tuple\nterminated with `?` instead of `.`.\n\nFor instance we could query if some tuples exist in the database -\n\n```\n>>> edge(a, b)?\n\u21d2 edge('a', 'b')\n>>> edge(d, f)?\n\u21d2 \u00d8\n>>> \n```\n\nWe did define `edge(a, b).` so our query returns that tuple. However\nthe tuple `edge(d, f).` was not defined, so our query produces no\nresults. Rather than printing nothing, the `\u00d8` symbol which denotes\nthe empty set is printed for clarity.\n\nThis is correct, but uninteresting. How can we find say all the edges\nfrom `a`? We don't have a construct like wildcards with which to match\nanything - yet.\n\nEnter logic variables. Logic variables are capitalized words, `X`,\n`Foo` and the like, which are interpreted as wildcards by the query\nengine. Capitalized words are always understood as logic variables.\n\n```\n>>> edge(a, X)?\n\u21d2 edge('a', 'b')\n```\n\nHowever unlike wildcards which simply match anything, logic variables\nare unified within a query. Were we to write `edge(X, X)?` we would be\nasking for the set of tuples such that both elements of the `edge`\ntuple equate.\n\n```\n>>> edge(X, X)?\n\u21d2 \u00d8\n```\n\nOf which we have none.\n\nBut what if we wanted to find paths between edges? Say to check if a\npath existed from `a` to `d`. We'd need to find a way to unify many\nlogic variables together - and so far we've only seen queries of a\nsingle tuple.\n\nEnter rules. We can define productions by which the Datalog engine can\nproduce new tuples. Rules are written as a tuple \"pattern\" which may\ncontain constants or logic variables, followed by a sequence of\n\"clauses\" separated by the `:-` assignment operator.\n\nRules are perhaps best understood as subqueries. A rule defines an\nindefinite set of tuples such that over that set, the query clauses\nare simultaneously satisfied. This is how we achieve complex queries.\n\nThere is no alternation - or - operator within a rule's body. However,\nrules can share the same tuple \"pattern\".\n\nSo if we wanted to say find paths between edges in our database, we\ncould do so using two rules. One which defines a \"simple\" path, and\none which defines a path from `X` to `Y` recursively by querying for\nan edge from `X` to an unconstrained `Z`, and then unifying that with\n`path(Z, Y)`.\n\n```\n>>> path(X, Y) :- edge(X, Y).\n\u21d2 path('X', 'Y') :- edge('X', 'Y').\n>>> path(X, Y) :- edge(X, Z), path(Z, Y).\n\u21d2 path('X', 'Y') :- edge('X', 'Z'), path('Z', 'Y').\n>>> path(a, X)?\n\u21d2 path('a', 'b')\n\u21d2 path('a', 'c')\n\u21d2 path('a', 'd')\n```\n\nWe could also ask for all paths -\n\n```\n>>> path(X, Y)?\n\u21d2 path('b', 'c')\n\u21d2 path('a', 'b')\n\u21d2 path('c', 'd')\n\u21d2 path('b', 'd')\n\u21d2 path('a', 'c')\n\u21d2 path('a', 'd')\n```\n\nDatalog also supports negation. Within a rule, a tuple prefixed with\n`~` becomes a negative statement. This allows us to express \"does not\nexist\" relations, or antjoins. Note that this is only possible by\nmaking the [closed world assumption](https://en.wikipedia.org/wiki/Closed-world_assumption).\n\nDatalog also supports binary equality as a special relation. `=(X,Y)?`\nis a nonsense query alone because the space of `X` and `Y` are\nundefined. However within a rule body, equality (and negated\nequality statements!) can be quite useful.\n\nFor convenience, the Datalog interpreter supports \"retracting\"\n(deletion) of tuples and rules. `edge(a, b)!` would retract that\nconstant tuple, but we cannot retract `path(a, b)!` as that tuple is\ngenerated by a rule. We can however retract the rule - `edge(X, Y)!`\nwhich would remove both edge production rules from the database.\n\nThe Datalog interpreter also supports reading tuples (and rules) from\none or more files, each specified by the `--db ` command\nline argument.\n\n## Usage\n\n`pip install --user arrdem.datalog.shell`\n\nThis will install the `datalog` interpreter into your user-local\npython `bin` directory, and pull down the core `arrdem.datalog` engine\nas well.\n\n## Status\n\nThis is a complete to my knowledge implementation of a traditional datalog.\n\nSupport is included for binary `=` as builtin relation, and for negated terms in\nrules (prefixed with `~`)\n\nRules, and the recursive evaluation of rules is supported with some guards to\nprevent infinite recursion.\n\nThe interactive interpreter supports definitions (terms ending in `.`),\nretractions (terms ending in `!`) and queries (terms ending in `?`), see the\ninterpreter's `help` response for more details.\n\n### Limitations\n\nRecursion may have some completeness bugs. I have not yet encountered any, but I\nalso don't have a strong proof of correctness for the recursive evaluation of\nrules yet.\n\nThe current implementation of negated clauses CANNOT propagate positive\ninformation. This means that negated clauses can only be used in conjunction\nwith positive clauses. It's not clear if this is an essential limitation.\n\nThere is as of yet no query planner - not even segmenting rules and tuples by\nrelation to restrict evaluation. This means that the complexity of a query is\n`O(dataset * term count)`, which is clearly less than ideal.\n\n## License\n\nMirrored from https://git.arrdem.com/arrdem/datalog-py\n\nPublished under the MIT license. See [LICENSE.md](LICENSE.md)", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://git.arrdem.com/arrdem/datalog-shell", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "arrdem.datalog.shell", "package_url": "https://pypi.org/project/arrdem.datalog.shell/", "platform": "", "project_url": "https://pypi.org/project/arrdem.datalog.shell/", "project_urls": { "Homepage": "https://git.arrdem.com/arrdem/datalog-shell" }, "release_url": "https://pypi.org/project/arrdem.datalog.shell/0.0.2/", "requires_dist": null, "requires_python": "", "summary": "A shell for my datalog engine", "version": "0.0.2" }, "last_serial": 5416605, "releases": { "0.0.0": [ { "comment_text": "", "digests": { "md5": "0abf6e8de8a8ce5d3d33fcbbdbff1095", "sha256": "54b7f53e06a5d6d434506b1dd4aea61a7b2a1c35709d64c129a5c8b7043e4064" }, "downloads": -1, "filename": "arrdem.datalog.shell-0.0.0.tar.gz", "has_sig": false, "md5_digest": "0abf6e8de8a8ce5d3d33fcbbdbff1095", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6759, "upload_time": "2019-06-16T19:20:00", "url": "https://files.pythonhosted.org/packages/8f/6f/9a39d65b23d232ca43ae4b59a6719037d117fdb36b12af341e9a1523fbdf/arrdem.datalog.shell-0.0.0.tar.gz" } ], "0.0.1": [ { "comment_text": "", "digests": { "md5": "7cf6afcda26b1ef9a496a4b410610c15", "sha256": "aec188844194ba2e174c03ad96bd3c221e1ebd709f0cd59d8c883ae4571aa7a9" }, "downloads": -1, "filename": "arrdem.datalog.shell-0.0.1.tar.gz", "has_sig": false, "md5_digest": "7cf6afcda26b1ef9a496a4b410610c15", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7104, "upload_time": "2019-06-18T08:01:56", "url": "https://files.pythonhosted.org/packages/8b/c0/8cee933f9c233b457e6397b82314e426f9a1c390eb28bb1f9a1b84dcfb77/arrdem.datalog.shell-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "c397643a58f37d2056d116524a6a308e", "sha256": "d8a1574432e116d584f01ca25b632d669985bd56000af687b5fc91b2d694e58c" }, "downloads": -1, "filename": "arrdem.datalog.shell-0.0.2.tar.gz", "has_sig": false, "md5_digest": "c397643a58f37d2056d116524a6a308e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7239, "upload_time": "2019-06-18T18:31:42", "url": "https://files.pythonhosted.org/packages/c9/5a/ff434e103b445060b3a38997c861e7f895bb6e1d2264f122a196cdae979d/arrdem.datalog.shell-0.0.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "c397643a58f37d2056d116524a6a308e", "sha256": "d8a1574432e116d584f01ca25b632d669985bd56000af687b5fc91b2d694e58c" }, "downloads": -1, "filename": "arrdem.datalog.shell-0.0.2.tar.gz", "has_sig": false, "md5_digest": "c397643a58f37d2056d116524a6a308e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7239, "upload_time": "2019-06-18T18:31:42", "url": "https://files.pythonhosted.org/packages/c9/5a/ff434e103b445060b3a38997c861e7f895bb6e1d2264f122a196cdae979d/arrdem.datalog.shell-0.0.2.tar.gz" } ] }