{ "info": { "author": "Databases & Information Systems Group, TU Ilmenau", "author_email": "stefan.hagedorn@tu-ilmenau.de", "bugtrack_url": null, "classifiers": [ "Operating System :: OS Independent", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7" ], "description": "[![Testing](https://dbgit.prakinf.tu-ilmenau.de/code/grizzly/badges/master/pipeline.svg)](https://dbgit.prakinf.tu-ilmenau.de/code/grizzly/commits/master)\n[![coverage report](https://dbgit.prakinf.tu-ilmenau.de/code/grizzly/badges/master/coverage.svg)](https://dbgit.prakinf.tu-ilmenau.de/code/grizzly/commits/master)\n\n# Grizzly\n\nGrizzly is a transpiler from a Python-API to SQL to move computations from the client into a database system.\n\nGrizzly implements its own `DataFrame` structure that tracks operations, like projection, filter, joins, ...\nOnly when the result of the sequence of operations is needed, a SQL string is produced, resembling all those operations, and sent to a DBMS.\nThis way, you don't have to care about Out-of-Memory problems, un-optimized queries, and high CPU load.\n\n## Installation\nGrizzly is available on PyPi: https://pypi.org/project/grizzly-sql/0.1/\n```\npip3 install --user grizzly-sql\n```\n\n## Dependencies\nGrizzly uses\n - Python 3\n - [SQLite3](https://docs.python.org/2/library/sqlite3.html) (currently for tests only)\n - [BeautifulTable](https://github.com/pri22296/beautifultable) for pretty output\n\n\n## Getting started\n\n### Connection\nConnect to your database using an appropriate connection string:\n```python\nimport sqlite3\ncon = sqlite3.connect(\"grizzly.db\")\n```\nNow, reference the table(s) you want to work with:\n```python\ndf = grizzly.read_table(\"events\", con)\n```\n\nThe connection can also be set globally in the `connection.Connection.db` field.\n\nHere, `df` is just a reference, it contains no data from your table.\nTo show its contents, use the `show` method:\n```python\ndf.show()\n```\nThis will print the table's content on the screen. \n\n### Filter & Projection\nOperations are similar to Pandas:\n```python\ndf[df[\"id\" == 42]] # filter\ndf = df[[\"actor1\",\"actor2\"]]\n```\n### Joins\n\nA `DataFrame` can be joined with another `DataFrame`:\n```python\ndf1 = grizzly.read_table(\"table1\")\ndf2 = grizzly.read_table(\"table2\")\n\njoined = df1.join(df2, on=[\"joinCol1\", \"joinCol2\"], how=\"inner\", comp='=')\n```\nIn the `on` parameter, you specify the join columns. The first one is for the left input (`df1`), the second one for the right input (`df2`).\nThe `how` parameter is used to select the join type: `inner`, `left outer`, etc. This value is directly placed into the generated query, and thus depends on \nthe dialect of the underlying DBMS. An additional `comp` parameter lets you choose the comparison operator.\n\nYou sometimes want to join on multiple columns with different comparisons. For this, in Grizzly you define the expression as if it was for filters:\n```python\ndf1 = grizzly.read_table(\"t1\")\ndf2 = grizzly.read_table(\"t2\")\n\nj = df1.join(df2, on = (df1['a'] == df2['b']) & (df1['c'] <= df2['d']), how=\"left outer\")\n```\n\nThis results in the following SQL code:\n```sql \nSELECT * \nFROM t1 \n LEFT OUTER JOIN (SELECT * FROM t2 ) IOBRD \n ON (t1.a = IOBRD.b) AND (t1.c <= IOBRD.d)\n```\n### Grouping & Aggregation\n\nYou can also group the data on multiple columns and compute an aggregate over the groups:\n```python\ndf = grizzly.read_table(\"events\") \ndf = df[df['id'] == 42]\ng = df.groupby([\"year\",\"actor1\"])\n\na = g.count(\"actor2\")\n```\nIf no aggregation function is used an `show()` is called, only the grouping columns are selected.\nYou can apply aggregation functions on non-grouped `DataFrame`s of course. In this case the aggregates will be computed for the whole content.\n\nThus, `a.sql()` will give\n```sql\nSELECT year, actor1, count(actor2) \nFROM events\nWHERE id = 42\nGROUP BY year, actor1\n```\n, whereas `df.count()` (i.e. before the grouping) for the above piece of code will return the single scalar value with the number of records in `df`:\n```sql\nSELECT count(*) \nFROM events\nWHERE id = 42\n```\n\n### SQL\n\nYou can inspect the produced SQL string with `sql()`:\n```python\nprint(df.sql()) \n```\nAnd the output will be \n```sql\nSELECT actor1, actor2\nFROM events\nWHERE id = 42\n```\n\n## Supported operations\n - filter/selection\n - projection\n - join\n - group by\n - aggregation functions\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/dbis-ilm/grizzly", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "grizzly-sql", "package_url": "https://pypi.org/project/grizzly-sql/", "platform": "", "project_url": "https://pypi.org/project/grizzly-sql/", "project_urls": { "Homepage": "https://github.com/dbis-ilm/grizzly" }, "release_url": "https://pypi.org/project/grizzly-sql/0.1.1/", "requires_dist": null, "requires_python": ">=3.6", "summary": "A Python-to-SQL transpiler to work with relational databases", "version": "0.1.1" }, "last_serial": 5881858, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "14abb13fa147a40c3b8d20c57a7348b1", "sha256": "9fedc9878c13e814e35e0b19efe4a82dc54eade75471b2e92c901baf2d22b27f" }, "downloads": -1, "filename": "grizzly_sql-0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "14abb13fa147a40c3b8d20c57a7348b1", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 3767, "upload_time": "2019-09-24T20:25:24", "url": "https://files.pythonhosted.org/packages/d3/5a/f8ce078758d80c2a57275e60d5a5c69c9cde094a215db7d24d45ac66e275/grizzly_sql-0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8ca63586147490d5264c3d6c157568cf", "sha256": "db5f28ee1dffe8f7d4166a3b5c091653cdfaa161ca0ed566902e449f9ef6a67b" }, "downloads": -1, "filename": "grizzly-sql-0.1.tar.gz", "has_sig": false, "md5_digest": "8ca63586147490d5264c3d6c157568cf", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 3160, "upload_time": "2019-09-24T20:25:27", "url": "https://files.pythonhosted.org/packages/87/d6/3d9ccd959a53458331be6dc46c90a0127ce7d1d75eae4e106e60cb328979/grizzly-sql-0.1.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "ad0293d2273650fb0b7cb2ac17ccdbc6", "sha256": "00c52b1e2dbcb1e1233a746e83bc9735b553a23e69c5a246a685af82ea0eb187" }, "downloads": -1, "filename": "grizzly_sql-0.1.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "ad0293d2273650fb0b7cb2ac17ccdbc6", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.6", "size": 8991, "upload_time": "2019-09-24T21:11:46", "url": "https://files.pythonhosted.org/packages/05/51/8fff326b356a4cfe0920b752ed03bad08c4c22bfc0285ad7d2e0ebc9ed92/grizzly_sql-0.1.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3c692722bdc50ea5bd1cb01eae53ba00", "sha256": "2e6b2b9ebfe67ed8615402c9ec42a44c73c69bf6a47123f1c9c805ef4c108b93" }, "downloads": -1, "filename": "grizzly-sql-0.1.1.tar.gz", "has_sig": false, "md5_digest": "3c692722bdc50ea5bd1cb01eae53ba00", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 7030, "upload_time": "2019-09-24T21:11:49", "url": "https://files.pythonhosted.org/packages/f3/0a/0a1ea2e50c99de843947cb7e69175999f8a0ab746b307115485944627601/grizzly-sql-0.1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "ad0293d2273650fb0b7cb2ac17ccdbc6", "sha256": "00c52b1e2dbcb1e1233a746e83bc9735b553a23e69c5a246a685af82ea0eb187" }, "downloads": -1, "filename": "grizzly_sql-0.1.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "ad0293d2273650fb0b7cb2ac17ccdbc6", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": ">=3.6", "size": 8991, "upload_time": "2019-09-24T21:11:46", "url": "https://files.pythonhosted.org/packages/05/51/8fff326b356a4cfe0920b752ed03bad08c4c22bfc0285ad7d2e0ebc9ed92/grizzly_sql-0.1.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3c692722bdc50ea5bd1cb01eae53ba00", "sha256": "2e6b2b9ebfe67ed8615402c9ec42a44c73c69bf6a47123f1c9c805ef4c108b93" }, "downloads": -1, "filename": "grizzly-sql-0.1.1.tar.gz", "has_sig": false, "md5_digest": "3c692722bdc50ea5bd1cb01eae53ba00", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 7030, "upload_time": "2019-09-24T21:11:49", "url": "https://files.pythonhosted.org/packages/f3/0a/0a1ea2e50c99de843947cb7e69175999f8a0ab746b307115485944627601/grizzly-sql-0.1.1.tar.gz" } ] }