{ "info": { "author": "Matt Durrant", "author_email": "matthewgeorgedurrant@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "# `tidypython` - A simple python package designed to syntactically mimic the tidyr package in R. \n\n\nInstall the package with pip:\n\n pip install tidypython\n\nLoad the functions into your script with:\n\n from tidypython import *\n\nYou will then have access to the functions:\n\n gather(*args, **kwargs)\n spread(*args, **kwargs)\n separate(*args, **kwargs)\n\nThe syntax is designed to resemble that of the R package `tidyr` as closely as possible. \nNot all of the functionality is implemented yet, but the basics are there.\n\nAll of these functions are designed to work with the `>>` operator from the `dplython` package.\n\n\nThis package is currently in development. Please visit the GitHub page if you'd like to contribute: \nhttps://github.com/durrantmm/tidypython\n\n# Examples\n\nFirst, import the dplython package:\n\n from dplython import *\n\nMake sure that your dataframe is a DplyFrame object. You can ensure this by calling:\n\n df = DplyFrame(df)\n\nOr you can read in your file directly using the `readpy` package:\n\n from readpy import *\n df = read_tsv(\"myfile.tsv\")\n\n\n### `gather()`\n\nThe `gather()` command implements the pandas `melt()` function using the `tidyr` syntax.\n\nYou can use the `gather` command as:\n\n df >> gather(X.key, X.value, X.column1, X.column2...)\n\n`X.key` and `X.value` are used to determine the new names of the key and value columns that will be created.\n\nBy default, this will use column1, column2, and any further columns you list to determine the keys and the values. All\nunspecified columns will be used simply as an index. 
Alternatively, you can use the syntax\n\n df >> gather(X.key, X.value, X.column1, X.column2, exclude=True)\n\nThis makes column1 and column2 the index, and all other unspecified columns are used for the key and value \ncolumns.\n\nUsing the `mtcars` data as an example:\n\n >>> mtcars = read_tsv('mtcars.tsv')\n >>> print(mtcars >> head())\n name mpg cyl disp hp drat wt qsec vs am gear carb\n 0 Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4\n 1 Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4\n 2 Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1\n 3 Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1\n 4 Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2\n\n\nLet's first gather all of the columns of interest by inclusion:\n\n >>> mtcars_gathered_inclusion = mtcars >> \\\n ... gather(X.info, X.val, X.mpg, X.cyl, X.disp, X.hp, X.drat, X.wt, X.qsec, X.vs, X.am, X.gear, X.carb)\n >>> print(mtcars_gathered_inclusion >> head())\n name info val\n 0 Mazda RX4 mpg 21.0\n 1 Mazda RX4 Wag mpg 21.0\n 2 Datsun 710 mpg 22.8\n 3 Hornet 4 Drive mpg 21.4\n 4 Hornet Sportabout mpg 18.7\n >>> print(mtcars_gathered_inclusion >> tail())\n name info val\n 347 Lotus Europa carb 2.0\n 348 Ford Pantera L carb 4.0\n 349 Ferrari Dino carb 6.0\n 350 Maserati Bora carb 8.0\n 351 Volvo 142E carb 2.0\n\nNow we can do it by exclusion, which is much shorter in this case:\n\n >>> mtcars_gathered_exclusion = mtcars >> \\\n ... 
gather(X.info, X.val, X.name, exclude=True)\n >>> print(mtcars_gathered_exclusion >> head())\n name info val\n 0 Mazda RX4 mpg 21.0\n 1 Mazda RX4 Wag mpg 21.0\n 2 Datsun 710 mpg 22.8\n 3 Hornet 4 Drive mpg 21.4\n 4 Hornet Sportabout mpg 18.7\n >>> print(mtcars_gathered_exclusion >> tail())\n name info val\n 347 Lotus Europa carb 2.0\n 348 Ford Pantera L carb 4.0\n 349 Ferrari Dino carb 6.0\n 350 Maserati Bora carb 8.0\n 351 Volvo 142E carb 2.0\n\nYou can see that it functions in much the same manner as the `tidyr::gather` function.\n\n### `spread()`\n\nThe `spread()` command implements the pandas `pivot()` function using the `tidyr` syntax.\n\nYou can use the `spread` command as:\n\n df >> spread(X.key, X.value)\n\n`X.key` and `X.value` specify the existing columns to pivot, and all other columns are\nassumed to be the index.\n\nUsing the `mtcars` data as an example:\n\n >>> mtcars = read_tsv('mtcars.tsv')\n >>> print(mtcars >> head())\n name mpg cyl disp hp drat wt qsec vs am gear carb\n 0 Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4\n 1 Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4\n 2 Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1\n 3 Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1\n 4 Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2\n\n\nLet's first gather all of the columns of interest by exclusion:\n\n >>> mtcars_gathered_exclusion = mtcars >> \\\n ... 
gather(X.info, X.val, X.name, exclude=True)\n >>> print(mtcars_gathered_exclusion >> head())\n name info val\n 0 Mazda RX4 mpg 21.0\n 1 Mazda RX4 Wag mpg 21.0\n 2 Datsun 710 mpg 22.8\n 3 Hornet 4 Drive mpg 21.4\n 4 Hornet Sportabout mpg 18.7\n\nWe can then pivot the `info` and `val` columns out by using the `spread` function:\n\n >>> print(mtcars_gathered_exclusion >> spread(X.info, X.val) >> head())\n name mpg cyl disp hp drat wt qsec vs am gear carb\n 0 AMC Javelin 15.2 8.0 304.0 150.0 3.15 3.435 17.30 0.0 0.0 3.0 2.0\n 1 Cadillac Fleetwood 10.4 8.0 472.0 205.0 2.93 5.250 17.98 0.0 0.0 3.0 4.0\n 2 Camaro Z28 13.3 8.0 350.0 245.0 3.73 3.840 15.41 0.0 0.0 3.0 4.0\n 3 Chrysler Imperial 14.7 8.0 440.0 230.0 3.23 5.345 17.42 0.0 0.0 3.0 4.0\n 4 Datsun 710 22.8 4.0 108.0 93.0 3.85 2.320 18.61 1.0 1.0 4.0 1.0\n\n\nYou can see that it functions in much the same manner as the `tidyr::spread` function. As currently implemented,\nthe order of the columns will be preserved in Python >= 3.6, but the order of the index will not.\n\n\n### `separate()`\n\nThe `separate()` command doesn't have a direct parallel in other Python packages that I am aware of.\n\nYou can use the `separate` command as:\n\n df >> separate(X.column, into, sep=myseparator)\n\n`X.column` is the column that you want to split, `into` is a list of the new column names for the split columns,\nand `sep` is a regular expression used to split `X.column`. 
By default, this will split by `[^\\w]+`.\n\n\nLet's say that our mtcars dataframe only included a `name`, `mpg`, and `cyl` joined in a single column by the \nseparator '|':\n\n\n >>> print(mtcars_messy >> head())\n name \n 0 Mazda RX4|21.0|6\n 1 Mazda RX4 Wag|21.0|6\n 2 Datsun 710|22.8|4\n 3 Hornet 4 Drive|21.4|6\n 4 Hornet Sportabout|18.7|8\n\nYou could separate this column into three columns using the command:\n\n >>> mtcars_clean = mtcars_messy >> separate(X.name, ['name', 'mpg', 'cyl'], sep='\\|')\n >>> print(mtcars_clean >> head())\n name mpg cyl\n 0 Mazda RX4 21.0 6\n 1 Mazda RX4 Wag 21.0 6\n 2 Datsun 710 22.8 4\n 3 Hornet 4 Drive 21.4 6\n 4 Hornet Sportabout 18.7 8\n\nNote that all of the columns remain strings after separating. No warning is given if the number \nof columns specified does not match the number of strings after splitting.\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/durrantmm/tidypython", "keywords": "tidyr R syntax", "license": "", "maintainer": "", "maintainer_email": "", "name": "tidypython", "package_url": "https://pypi.org/project/tidypython/", "platform": "", "project_url": "https://pypi.org/project/tidypython/", "project_urls": { "Homepage": "https://github.com/durrantmm/tidypython" }, "release_url": "https://pypi.org/project/tidypython/0.0.1.dev3/", "requires_dist": [ "pandas", "dplython", "readpy" ], "requires_python": "", "summary": "A package designed to syntactically mimic the tidyr R package", "version": "0.0.1.dev3" }, "last_serial": 4902959, "releases": { "0.0.1.dev1": [ { "comment_text": "", "digests": { "md5": "fac497f6bd6bb9f5fadc82e17dae4f6b", "sha256": "8575a5010e713d908f765fedc5655aa13649d474f61fc793b646f0e4b6573be8" }, "downloads": -1, "filename": "tidypython-0.0.1.dev1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "fac497f6bd6bb9f5fadc82e17dae4f6b", 
"packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 3772, "upload_time": "2019-03-05T23:20:58", "url": "https://files.pythonhosted.org/packages/cf/97/f2a1cfba2ffb8f97cadce3bef16a49e4a4073f7218565e8157a7031ae7fd/tidypython-0.0.1.dev1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "4818ff219806d668572c5cc6a96e2e24", "sha256": "a22d0d98e2cfaf05efcd3cf6b719a65d0bea0abe0c1c7676c58f61f8e76f2a74" }, "downloads": -1, "filename": "tidypython-0.0.1.dev1.tar.gz", "has_sig": false, "md5_digest": "4818ff219806d668572c5cc6a96e2e24", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2549, "upload_time": "2019-03-05T23:21:00", "url": "https://files.pythonhosted.org/packages/bd/af/a9f93c140c9d9f6d33064b3985d752ccb5a89de0ff1219b37de2cafbe187/tidypython-0.0.1.dev1.tar.gz" } ], "0.0.1.dev2": [ { "comment_text": "", "digests": { "md5": "91f3ff51c8e052cf525dd4956428f3a3", "sha256": "d865eb252a31cd6b16770102cc2bd2824b0e2c42734c7856cea0956f21375dde" }, "downloads": -1, "filename": "tidypython-0.0.1.dev2-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "91f3ff51c8e052cf525dd4956428f3a3", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 3866, "upload_time": "2019-03-05T23:47:54", "url": "https://files.pythonhosted.org/packages/1b/f0/360d39d14761d51df9cd1b98aa9fa9ca4ce62b26fa81060094507f615971/tidypython-0.0.1.dev2-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "08a38e86f458806c682bc3910122b413", "sha256": "7fd1c2660c5d1e70bb29ae5bbaf458290b78856976a0bac4dd34f260e550f95e" }, "downloads": -1, "filename": "tidypython-0.0.1.dev2.tar.gz", "has_sig": false, "md5_digest": "08a38e86f458806c682bc3910122b413", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2648, "upload_time": "2019-03-05T23:47:55", "url": 
"https://files.pythonhosted.org/packages/90/c0/193bca8752c0f8394929208e9e3f5836f92b9e9022b6688104ce32eea242/tidypython-0.0.1.dev2.tar.gz" } ], "0.0.1.dev3": [ { "comment_text": "", "digests": { "md5": "5d5b4a7befb9bd6708a208eaa12d8b0f", "sha256": "7a72657dc991a99c42fddd2439705b920abeb7e98958300022e9446d52655d5c" }, "downloads": -1, "filename": "tidypython-0.0.1.dev3-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "5d5b4a7befb9bd6708a208eaa12d8b0f", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 6167, "upload_time": "2019-03-06T01:18:41", "url": "https://files.pythonhosted.org/packages/9e/c8/92e68c5c9e3c79dde038660bf2b91f3d445c77879bc7f76df0dabea7c789/tidypython-0.0.1.dev3-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "a814e311960367567f71dcd80b98691f", "sha256": "3fcdfd5469a51cc06146f8c123c5c443a767da607e76ad128f4ce29cb9dd01af" }, "downloads": -1, "filename": "tidypython-0.0.1.dev3.tar.gz", "has_sig": false, "md5_digest": "a814e311960367567f71dcd80b98691f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5156, "upload_time": "2019-03-06T01:18:42", "url": "https://files.pythonhosted.org/packages/8a/62/e591d9f7f0b3878f120945c16d2b7c92b316fbdec8924b99b93c7a3ddccc/tidypython-0.0.1.dev3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "5d5b4a7befb9bd6708a208eaa12d8b0f", "sha256": "7a72657dc991a99c42fddd2439705b920abeb7e98958300022e9446d52655d5c" }, "downloads": -1, "filename": "tidypython-0.0.1.dev3-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "5d5b4a7befb9bd6708a208eaa12d8b0f", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 6167, "upload_time": "2019-03-06T01:18:41", "url": "https://files.pythonhosted.org/packages/9e/c8/92e68c5c9e3c79dde038660bf2b91f3d445c77879bc7f76df0dabea7c789/tidypython-0.0.1.dev3-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": 
"a814e311960367567f71dcd80b98691f", "sha256": "3fcdfd5469a51cc06146f8c123c5c443a767da607e76ad128f4ce29cb9dd01af" }, "downloads": -1, "filename": "tidypython-0.0.1.dev3.tar.gz", "has_sig": false, "md5_digest": "a814e311960367567f71dcd80b98691f", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5156, "upload_time": "2019-03-06T01:18:42", "url": "https://files.pythonhosted.org/packages/8a/62/e591d9f7f0b3878f120945c16d2b7c92b316fbdec8924b99b93c7a3ddccc/tidypython-0.0.1.dev3.tar.gz" } ] }