{ "info": { "author": "Alex Broley", "author_email": "alex.broley@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# Datamaker\n\nRapidly build tabular datasets in the following formats:\n- SQL Tables (Generate code for SQL CREATE and INSERT queries)\n- CSV \n- JSON\n\nSupports: PostgreSQL, MySQL, MS SQL, SQLite\n\n## Getting Started\nDatamaker uses core python 3 and has been extensively tested on python 3.6+. \n\n### Prerequisites\nPython 3.6+\n\n### Installing\n```pip install datamaker```\n\n### Build a dataset\nIn this example we build a SQL table with 50 rows and two columns: id, name. The create and \ninsert queries will be written to the file example.sql.\n```\nfrom datamaker.SQLTableGenerator import SQLTableGen\n\n\ntable_name = \"Example\"\nnum_rows = 50\ncolumn_defs = {\n \"id\": {\"col_type\": \"int\", \"is_primary_key\": True},\n \"name\": {\"col_type\": \"varchar\", \"is_name\": True}\n}\noutput_path = \"example.sql\"\n\ngen = SQLTableGen(table_name, num_rows, column_defs, output_path)\n\ngen.build_sql_doc()\n```\n\n## Documentation\n### Input parameters\n- **table_name** - This is the name of the table to be created\n- **num_rows** - The number of rows to insert into the table\n- **column_defs** - A dictionary object with key value pairs.\n- **path** - The path to the directory and file name where you want to export the sql\nqueries or dataset for the specified format.\n\n**column_defs** is a dictionary object of the form:\n```\n{column_name: {\"col_type\": type, ...}}\n```\nNote that col_type is always specified for each column. The definitions of the parameters for **colum_defs** is given \nbelow.\n\n- col_type: int, float, varchar, bool\n- varchar_length: int (required for MySQL and MS SQL)\n- is_primary_key: bool - True if this field is the primary key\n- is_name: bool - True if values for this column should come from the names collection\n- is_email: bool - True if this column is an email. Note that you must have at least one column with is_name = True \nto use is_email.\n- is_date: bool - True if this column is a date. Default format is YYYY-mm-dd\n- is_timestamp: bool - True if this column is a timestamp. Default format is YYYY-mm-dd HH:MM:SS\n- is_country: bool - True of this column is a country code. Codes are ISO Alpha-2.\n- is_state: bool - True if this column is a US state abbreviation. \n- is_address: bool - True if this column is a street address.\n- bounds: tuple(lowerbnd, upperbnd) - Forces lowerbnd <= column values <= upperbnd\n- fixed_collection: tuple(elem1, elem2, ..., elemn) - Forces column values to be elements of this tuple\n- fixed_map: tuple(column_name, {key: value}) - Forces values for this column to be associated to the value of another \ncolumn\n- is_nullable: bool - True if this column can have null values\n- proportion_null: float - The approximate proportion of expected null values in this column. Note that is_nullable \nmust be true to use proportion_null\n\nNote that using is_primary_key assumes the column you're defining is an integer column starting at 1 and incrementing\nby 1 for each row.\n\nIf you're planning on running queries on a MySQL or MS SQL Server machine then you\nneed to specify the length of each varchar field using the varchar_length parameter.\n\n## CSV and JSON generators\nDatamaker can also be used to generate CSV and JSON files.\n\nTo create a CSV file add the following import statement to your script.\n\n```from datamaker.CSVGenerator import CSVGen```\n\nAnd to create a JSON file add the following import to your script.\n\n```from datamaker.JSONGenerator import JSONGen```\n\n\nThe input parameters are no different for CSVGen and JSONGen are the same as given above. If we want to create the \n**Example** table given in the example code above as a CSV or JSON document we only to change the import statement and \ninstantiate the CSVGen or JSONGen object using the same column_defs dictionary. Then call the appropriate build method\nfor the given class. The CSVGen class uses ```build_csv_doc``` and the JSONGen class uses ```build_json_doc```.\n\n```\nfrom datamaker.CSVGenerator import CSVGen\n\n\ntable_name = \"Example\"\nnum_rows = 50\ncolumn_defs = {\n \"id\": {\"col_type\": \"int\", \"is_primary_key\": True},\n \"name\": {\"col_type\": \"varchar\", \"is_name\": True}\n}\noutput_path = \"example.csv\"\n\ngen = CSVGen(table_name, num_rows, column_defs, output_path)\n\ngen.build_csv_doc()\n```\n\nHere's the same **Example** table built as a JSON document.\n\n```\nfrom datamaker.JSONGenerator import JSONGen\n\n\ntable_name = \"Example\"\nnum_rows = 50\ncolumn_defs = {\n \"id\": {\"col_type\": \"int\", \"is_primary_key\": True},\n \"name\": {\"col_type\": \"varchar\", \"is_name\": True}\n}\noutput_path = \"example.json\"\n\ngen = JSONGen(table_name, num_rows, column_defs, output_path)\n\ngen.build_json_doc()\n```\n\n## A more complex example\nIn this example we're generating SQL code to create a MySQL table which tracks orders placed for an imaginary online \nstore. We'll create a table with the following columns:\n- id - the primary key for this table\n- name - the name associated to the user who placed this order\n- order_ts - the timestamp associated with this order\n- product_id - the id for the product ordered\n- price - the cost of the product being ordered. Since we want every row with a given product_id to have the same cost we \nuse a fixed map to implement this column \n\nNote that in this example we're defining the output path as a valid directory instead of a path to a file. In this \ncase datamaker will automatically generate the name of the file as: \n\n**table_name**.**current_date**.sql\n\nWhere **table_name** is the name of the table and **current_date** is today's date.\n\n```\nfrom datamaker.SQLTableGenerator import SQLTableGen\n\n\ntable_name = \"Orders\"\nnum_rows = 500\ncolumn_defs = {\n \"id\": {\"col_type\": \"int\", \"is_primary_key\": True},\n \"name\": {\"col_type\": \"varchar\", \"varchar_length\": 50, \"is_name\": True},\n \"order_ts\": {\"col_type\": \"varchar\", \"varchar_length\": 26, \"is_timestamp\": True},\n \"product_id\": {\"col_type\": \"int\", \"bounds\": (1, 5)},\n \"price\": {\"col_type\": \"float\", \"fixed_map\": (\"product_id\", {1: 5.99, 2: 10.99, 3: 0.99, 4: 99.99, 5: 20})},\n}\noutput_path = \".\"\n\ngen = SQLTableGen(table_name, num_rows, column_defs, output_path)\n\ngen.build_sql_doc()\n```\n\n### Future improvements:\n- Add support for sequential fields\n- Add support for column type inference\n- Add support for tsv generation\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/tekkabroley/DataMaker", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "datamaker", "package_url": "https://pypi.org/project/datamaker/", "platform": "", "project_url": "https://pypi.org/project/datamaker/", "project_urls": { "Homepage": "https://github.com/tekkabroley/DataMaker" }, "release_url": "https://pypi.org/project/datamaker/1.0.1/", "requires_dist": null, "requires_python": ">=3.6", "summary": "A package for generating tabluar datasets", "version": "1.0.1", "yanked": false, "yanked_reason": null }, "last_serial": 8334227, "releases": { "0.1.1": [ { "comment_text": "", "digests": { "md5": "42ac7e20e30de6308fe8e9c9caf69cfc", "sha256": "27e71c5503fe6d1e6deca53adc04392e1df2e12530d2cfd3295e95c9e11c1f18" }, "downloads": -1, "filename": "datamaker-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "42ac7e20e30de6308fe8e9c9caf69cfc", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 22578, "upload_time": "2019-10-26T19:55:24", "upload_time_iso_8601": "2019-10-26T19:55:24.307603Z", "url": "https://files.pythonhosted.org/packages/5d/1e/91fa1b091b2f9a8855c1c9deedf64d7f9825a03d20df562f48ea1a821cea/datamaker-0.1.1-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "1ffd962b47afc52a08856dad5e9c3dfd", "sha256": "a9a30b6351df0175b66e4fd8adb0861874a76b553d7017540a9fcb5551a07044" }, "downloads": -1, "filename": "datamaker-0.1.1.tar.gz", "has_sig": false, "md5_digest": "1ffd962b47afc52a08856dad5e9c3dfd", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 13811, "upload_time": "2019-10-26T19:55:26", "upload_time_iso_8601": "2019-10-26T19:55:26.286065Z", "url": "https://files.pythonhosted.org/packages/ce/f4/d0dd10b0f0ddd246cb714a5e473fbd2a5da3ea2dd48bacc429a48445d12c/datamaker-0.1.1.tar.gz", "yanked": false, "yanked_reason": null } ], "0.1.2": [ { "comment_text": "", "digests": { "md5": "4cee3945937414ecec69424d026d11fa", "sha256": "651fdba9b64e3e3f16fe481aabb3232953ea96675f1af300bc52959f67a440c4" }, "downloads": -1, "filename": "datamaker-0.1.2-py3-none-any.whl", "has_sig": false, "md5_digest": "4cee3945937414ecec69424d026d11fa", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 23012, "upload_time": "2019-11-04T04:44:32", "upload_time_iso_8601": "2019-11-04T04:44:32.945383Z", "url": "https://files.pythonhosted.org/packages/93/d6/9cb3cb1f50d0f8cf50393bd3c422a5ea6d87cfc23706af38bb3eb257c467/datamaker-0.1.2-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "61bad5ac2d7501dd35b265cf7f8bae37", "sha256": "40b72ec715952690a3322159f68fb7abaf20627394fbcd6598fc683d3940c568" }, "downloads": -1, "filename": "datamaker-0.1.2.tar.gz", "has_sig": false, "md5_digest": "61bad5ac2d7501dd35b265cf7f8bae37", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 14706, "upload_time": "2019-11-04T04:44:34", "upload_time_iso_8601": "2019-11-04T04:44:34.125299Z", "url": "https://files.pythonhosted.org/packages/37/f5/85c3bf04107df68ed1200a2158a30129bfa0c16d58c001c702e86aa093bb/datamaker-0.1.2.tar.gz", "yanked": false, "yanked_reason": null } ], "0.1.3": [ { "comment_text": "", "digests": { "md5": "45caae19bf0c3333ef09d036b91de4ec", "sha256": "bc1352fe6dd1bd6e3b48d01626654e883968c4cd6eabc31ebea6daf41d104bbe" }, "downloads": -1, "filename": "datamaker-0.1.3-py3-none-any.whl", "has_sig": false, "md5_digest": "45caae19bf0c3333ef09d036b91de4ec", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 23097, "upload_time": "2019-11-05T03:44:00", "upload_time_iso_8601": "2019-11-05T03:44:00.327847Z", "url": "https://files.pythonhosted.org/packages/32/18/053e4991d2721d280f7c88b4d3688377696e39a8a8803488ce394e19da5f/datamaker-0.1.3-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "55e5c59ebd1e5eb8c239e2d613ceabb9", "sha256": "7abd8e1879fe95d0c1e926e981876f7269b79b48bd4f5c8ba5c887e9244ef601" }, "downloads": -1, "filename": "datamaker-0.1.3.tar.gz", "has_sig": false, "md5_digest": "55e5c59ebd1e5eb8c239e2d613ceabb9", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 14806, "upload_time": "2019-11-05T03:44:01", "upload_time_iso_8601": "2019-11-05T03:44:01.831534Z", "url": "https://files.pythonhosted.org/packages/aa/ad/bdb85a1158ac693ca8d2367e680af6243d152a854185ce2cf150de604300/datamaker-0.1.3.tar.gz", "yanked": false, "yanked_reason": null } ], "0.1.4": [ { "comment_text": "", "digests": { "md5": "bdea91d5ee79649991dca624d5b11571", "sha256": "023c932a4e098d9ef7150b19afcfe61bb8ab0d648fddb5bcf7659f7c1c90a871" }, "downloads": -1, "filename": "datamaker-0.1.4-py3-none-any.whl", "has_sig": false, "md5_digest": "bdea91d5ee79649991dca624d5b11571", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 23870, "upload_time": "2019-11-12T20:43:52", "upload_time_iso_8601": "2019-11-12T20:43:52.493419Z", "url": "https://files.pythonhosted.org/packages/7d/b1/6d851805492c64a02ab130fd5ffa8906df61caad3ab91dd5d05a785230ab/datamaker-0.1.4-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "ddb028fac927f83c0cf0068d9a2ca31d", "sha256": "054474c370e18361cd901b727fc769c2666d4d5345424766b58abb89117018c7" }, "downloads": -1, "filename": "datamaker-0.1.4.tar.gz", "has_sig": false, "md5_digest": "ddb028fac927f83c0cf0068d9a2ca31d", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 16363, "upload_time": "2019-11-12T20:43:53", "upload_time_iso_8601": "2019-11-12T20:43:53.801347Z", "url": "https://files.pythonhosted.org/packages/da/8e/3bdba817b341be0fdf24d36a921c12f0f30bddf66b5a4b8bbf5da6e1ea72/datamaker-0.1.4.tar.gz", "yanked": false, "yanked_reason": null } ], "1.0.1": [ { "comment_text": "", "digests": { "md5": "0bff9e5b55d4da11235efe9318e0f00f", "sha256": "21929fee57628dcc4f04efdc8e4842ad9b78490d8bf207e68d965feee4401fe0" }, "downloads": -1, "filename": "datamaker-1.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "0bff9e5b55d4da11235efe9318e0f00f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 23886, "upload_time": "2020-10-03T21:36:01", "upload_time_iso_8601": "2020-10-03T21:36:01.258794Z", "url": "https://files.pythonhosted.org/packages/08/ed/91a0ca8ec92dee750a9ac5f43479a4764b7cbc3253c3be9e7ac90f251db3/datamaker-1.0.1-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "63987357bcd364cfee545b4860584dba", "sha256": "0a6a6005634b91732a9a73ddd3c493a618da607000d88f8c12320076a1b05cf8" }, "downloads": -1, "filename": "datamaker-1.0.1.tar.gz", "has_sig": false, "md5_digest": "63987357bcd364cfee545b4860584dba", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 16375, "upload_time": "2020-10-03T21:36:02", "upload_time_iso_8601": "2020-10-03T21:36:02.529055Z", "url": "https://files.pythonhosted.org/packages/69/94/d991decd55502b9f48b8f4b627491787ef7a76d06ab25753a8bad9764654/datamaker-1.0.1.tar.gz", "yanked": false, "yanked_reason": null } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "0bff9e5b55d4da11235efe9318e0f00f", "sha256": "21929fee57628dcc4f04efdc8e4842ad9b78490d8bf207e68d965feee4401fe0" }, "downloads": -1, "filename": "datamaker-1.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "0bff9e5b55d4da11235efe9318e0f00f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.6", "size": 23886, "upload_time": "2020-10-03T21:36:01", "upload_time_iso_8601": "2020-10-03T21:36:01.258794Z", "url": "https://files.pythonhosted.org/packages/08/ed/91a0ca8ec92dee750a9ac5f43479a4764b7cbc3253c3be9e7ac90f251db3/datamaker-1.0.1-py3-none-any.whl", "yanked": false, "yanked_reason": null }, { "comment_text": "", "digests": { "md5": "63987357bcd364cfee545b4860584dba", "sha256": "0a6a6005634b91732a9a73ddd3c493a618da607000d88f8c12320076a1b05cf8" }, "downloads": -1, "filename": "datamaker-1.0.1.tar.gz", "has_sig": false, "md5_digest": "63987357bcd364cfee545b4860584dba", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6", "size": 16375, "upload_time": "2020-10-03T21:36:02", "upload_time_iso_8601": "2020-10-03T21:36:02.529055Z", "url": "https://files.pythonhosted.org/packages/69/94/d991decd55502b9f48b8f4b627491787ef7a76d06ab25753a8bad9764654/datamaker-1.0.1.tar.gz", "yanked": false, "yanked_reason": null } ], "vulnerabilities": [] }