{ "info": { "author": "Integrichain Innovation Team", "author_email": "engineering@integrichain.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.7" ], "description": "# DFMock\npandas dataframe mocker for python3. \n\n## Why? \nSometimes you just need mock data that is already in pandas datatypes. There are lots of data gen tools out there, but most produce CSVs or some such that needs to be consumed by pandas; this adds a volitile point to your tests. **What if your logic is correct, but the datatypes didn't import correctly from the CSV?** So DFMock aims to generate mock dataframes where the datatypes are controlled. \n\n## Installation\nUse pip\n\n pip install dfmock\n\nYou can also get the source [here](git@github.com:IntegriChain1/DFMock.git)\n\n## The API\nusing is simple. \nimport into your module, select your column names and data types, set the number of columns, and generate your frame. \n from dfmock import DFMock\n\n columns = { \"hamburger\":\"string\",\n \"hot_dog\":\"integer\",\n \"shoelace\":\"timedelta\"\n }\n dfmock = DFMock(count=100, cols=columns)\n dfmock.generate_dataframe()\n\n my_mocked_dataframe = dfmock.dataframe\n\nPandas data types supported:\n\n| **PANDAS TYPE** | **DICT VALUE** |\n| :-------------- | -------------: |\n| object | string |\n| int64 | int |\n| float64 | float |\n| bool | bool |\n| datetime64 | datetime |\n| timedelta | timedelta |\n| category | category |\n\nso to make a dataframe with column \"banana\" that is an `int64` and \"rockstar\" that is a `category` type:\n\n dfmock.cols = {\"banana\":\"int\",\"rockstar\":\"category\"}\n\npretty simple.\n\n## Grow to Size\nsometimes you need your dataset to be a certain memory size instead of row size. The `grow_dataframe_to_size()` allows you to grow a frame until it reaches or passes the given size (in MB). \nNeed a 10MB dataframe? \n\n dfmock.generate_dataframe()\n dfmock.size\n ## returns 0.2 MB\n\n dfmock.grow_dataframe_to_size(10)\n dfmock.size\n ## returns ~10 MB\n\n## Grouping \n**NOTE**: `timedelta` and `category` datatypes are not currently supported. \nSometimes you need a column you can aggregate on with a given data type. For example, you may want 1M rows with one of 4 datetime values (maybe representing 4 observation reporting timestamps?). You can do this using **grouping**.\na grouped column is declared by passing a dict as the data type value with the params for the grouped column as keys. Like this: \n\n columns = {\"amazing_grouped_column_with_3_values\": { \"option_count\":3, \"option_type\":\"string\"}}\n\nThis will give you a column with only 3 distinct values and (nearly) equal distribution. \nIf you need to control the distribution you can pass the histogram argument. This is useful if, for instance, you want a dataset with 4 datetime values and want one value for 50% of records, another for 30%, and the remaining values 20%. You would delcare this distribution like so: \n\n columns = {\"super_cool_grouped_column_with_histogram\": {\"option_count\":4, \"option_type\":\"datetime\",\"histogram\":(5,3,2,2,)}}\n\nThe integers you pass histogram need to add up to 10 with the same number of values as option count (ie. if `option_count` = 5, you need 5 values). This caps option count at 10 currently for histogram. \nYour desired row count must also be divisible by 10 - this keeps the math simple. \n\n\n## Contribution\nWe welcome pull requests! \n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/integrichain1/dfmock", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "dfmock", "package_url": "https://pypi.org/project/dfmock/", "platform": "", "project_url": "https://pypi.org/project/dfmock/", "project_urls": { "Homepage": "https://github.com/integrichain1/dfmock" }, "release_url": "https://pypi.org/project/dfmock/0.0.14/", "requires_dist": [ "pandas" ], "requires_python": "", "summary": "utility for generating mock data sets as pandas dataframes", "version": "0.0.14" }, "last_serial": 4936196, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "d2bfc166bf9ac8bc0883cf59ea03c715", "sha256": "90fd6a1ddc3b38271802fae29ad23dd6ab21006fa088be498def5859455131e4" }, "downloads": -1, "filename": "dfmock-0.0.1-py3.7.egg", "has_sig": false, "md5_digest": "d2bfc166bf9ac8bc0883cf59ea03c715", "packagetype": "bdist_egg", "python_version": "3.7", "requires_python": null, "size": 7081, "upload_time": "2019-02-20T22:55:13", "url": "https://files.pythonhosted.org/packages/79/13/0ad68aca4bbc8fd0dfca1ec4fb24b8cf0332f26c4610d096052b42bb35b6/dfmock-0.0.1-py3.7.egg" }, { "comment_text": "", "digests": { "md5": "a80465e4e9362f97b3348248a78b7d38", "sha256": "c54223a775999cc629b1308cc16c7b80e1d617b5110f821d9dba4c1a19d4dc5e" }, "downloads": -1, "filename": "dfmock-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "a80465e4e9362f97b3348248a78b7d38", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 4126, "upload_time": "2019-02-20T22:55:11", "url": "https://files.pythonhosted.org/packages/62/af/6c5de1dac87bd72a34b55c223045de89702d51ce959ffd58a4a6d5b7b8b0/dfmock-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f0c0546a07a7673e5161d3a12a475596", "sha256": "7f0388b30206cf409ed9a16e8f9ae6665d702a4507752842d3062fed6bcbf549" }, "downloads": -1, "filename": "dfmock-0.0.1.tar.gz", "has_sig": false, "md5_digest": "f0c0546a07a7673e5161d3a12a475596", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3547, "upload_time": "2019-02-20T22:55:14", "url": "https://files.pythonhosted.org/packages/7f/08/c1e655cf72c18890df1761065ba4a1b9a1de418908fea48604234e9af59b/dfmock-0.0.1.tar.gz" } ], "0.0.11": [ { "comment_text": "", "digests": { "md5": "81c08c23655a00280a930be24ffa6d96", "sha256": "569f8e47c76e4dcb4d2388abfc4cd2dffe6fad8c78615ef340a23acbdd199a24" }, "downloads": -1, "filename": "dfmock-0.0.11-py3-none-any.whl", "has_sig": false, "md5_digest": "81c08c23655a00280a930be24ffa6d96", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 4145, "upload_time": "2019-02-20T23:02:21", "url": "https://files.pythonhosted.org/packages/38/0b/254a60c1fcddacd4bee3937080c694550a2121c183b34fdf0ec8459efae8/dfmock-0.0.11-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "a9094a7e418edbf5222ee5095dc33f78", "sha256": "267a204fd760dd35d73865eff2dbcb192e46d08314ea942cbe43a095c21e6bf0" }, "downloads": -1, "filename": "dfmock-0.0.11.tar.gz", "has_sig": false, "md5_digest": "a9094a7e418edbf5222ee5095dc33f78", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 3542, "upload_time": "2019-02-20T23:02:24", "url": "https://files.pythonhosted.org/packages/b2/c9/ba0861e034f7737540a2b1f01983206857793bed54c77c6cd40484c53ffc/dfmock-0.0.11.tar.gz" } ], "0.0.12": [ { "comment_text": "", "digests": { "md5": "5a1499044e550db38b36527ac8efe830", "sha256": "e73035a44762936ef02930c112dea81117df4134c31b50ffa7a7264e76f9ca26" }, "downloads": -1, "filename": "dfmock-0.0.12-py3-none-any.whl", "has_sig": false, "md5_digest": "5a1499044e550db38b36527ac8efe830", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 5311, "upload_time": "2019-02-28T14:43:46", "url": "https://files.pythonhosted.org/packages/a6/55/34fefc95b352f9b3db0cdc100c1dcbd64dcaa429710a0a54db35b99ee167/dfmock-0.0.12-py3-none-any.whl" } ], "0.0.13": [ { "comment_text": "", "digests": { "md5": "d0da0e4b042b8ccd2405b792d8042958", "sha256": "0bedd7a098b346a4d52421ff9be2f931b4f46c410e53f3a365d392aeca11c383" }, "downloads": -1, "filename": "dfmock-0.0.13-py3-none-any.whl", "has_sig": false, "md5_digest": "d0da0e4b042b8ccd2405b792d8042958", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 5424, "upload_time": "2019-02-28T17:26:17", "url": "https://files.pythonhosted.org/packages/cb/03/ef4f94267e8ed2a7413bac3b0bf95d59af20fb19ceb7c4765d24a6387333/dfmock-0.0.13-py3-none-any.whl" } ], "0.0.14": [ { "comment_text": "", "digests": { "md5": "22c8ce43d05658518afe619217c19f36", "sha256": "739924ca6bfca5afe66b4ceb3e2b46ef0e54ec308d11e0d02218726031701be3" }, "downloads": -1, "filename": "dfmock-0.0.14-py3-none-any.whl", "has_sig": false, "md5_digest": "22c8ce43d05658518afe619217c19f36", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 5659, "upload_time": "2019-03-13T19:46:25", "url": "https://files.pythonhosted.org/packages/c6/4f/2e1b7f9677d8cb79dede471c244db67613bbbd9210a9db0bd7d45eacad94/dfmock-0.0.14-py3-none-any.whl" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "22c8ce43d05658518afe619217c19f36", "sha256": "739924ca6bfca5afe66b4ceb3e2b46ef0e54ec308d11e0d02218726031701be3" }, "downloads": -1, "filename": "dfmock-0.0.14-py3-none-any.whl", "has_sig": false, "md5_digest": "22c8ce43d05658518afe619217c19f36", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 5659, "upload_time": "2019-03-13T19:46:25", "url": "https://files.pythonhosted.org/packages/c6/4f/2e1b7f9677d8cb79dede471c244db67613bbbd9210a9db0bd7d45eacad94/dfmock-0.0.14-py3-none-any.whl" } ] }