{ "info": { "author": "Shotaro Tanaka", "author_email": "yubessy0@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "# sqlcsv\n\nSimple command line tool that can be used to:\n\n- SELECT data from database and export the result as CSV\n- INSERT data into database from CSV\n\nNote that it works only with Python 3, not 2.\n\n## Installation\n\nVia PyPI:\n\n```\n$ pip3 install sqlcsv\n```\n\nIt does not specify any database drivers as explicit dependencies, so install the one you need:\n\n```\n# MySQL\n$ pip3 install mysqlclient\n\n# PostgreSQL\n$ pip3 install psycopg2\n```\n\n## Basic usage\n\nIn the examples below following table schema with MySQL is used:\n\n```sql\nCREATE TABLE testtable(\n id INT AUTO_INCREMENT PRIMARY KEY,\n int_col INT,\n float_col FLOAT,\n varchar_col VARCHAR(255)\n)\n```\n\n### Database connection\n\nDatabase connection can be specified using `--db-url` option in the form of SQLAlchemy URL:\n\n```\n$ sqlcsv --db-url 'mysql://testuser:testpassword@127.0.0.1:3306/testdb' ...\n```\n\nAlso it will be read from `SQLCSV_DB_URL` environment variable if set:\n\n```\n$ export SQLCSV_DB_URL='mysql://testuser:testpassword@127.0.0.1:3306/testdb'\n$ sqlcsv ...\n```\n\nFrom here they are omitted from command line examples.\n\n### SELECT\n\nAssume we already have following records on the table:\n\n```\n+----+---------+-----------+-------------+\n| id | int_col | float_col | varchar_col |\n+----+---------+-----------+-------------+\n| 1 | 1 | 1 | aaa |\n| 2 | 2 | 2 | bbb |\n| 3 | NULL | NULL | NULL |\n+----+---------+-----------+-------------+\n```\n\nUse `select` subcommand and give `SELECT` query using `--sql` option:\n\n```\n$ sqlcsv select --sql 'SELECT * FROM testtable'\nid,int_col,float_col,varchar_col\n1,1,1.0,aaa\n2,2,2.0,bbb\n3,,,\n```\n\nIf you want to save the result to file, use `--outfile` option:\n\n```\n$ sqlcsv select --sql 'SELECT * FROM testtable' --outfile out.csv\n```\n\n### INSERT\n\nAssume we already have following dataset in `input.csv`:\n\n```\nint_col,float_col,varchar_col\n1,1.0,aaa\n2,2.0,bbb\n```\n\nUse `insert` subcommand and give `INSERT` query with placeholders using `--sql` option, followed by `--types` option specifying types of each field:\n\n```\n$ sqlcsv insert \\\n --sql 'INSERT INTO testtable(int_col, float_col, varchar_col) VALUES (%s, %s, %s)' \\\n --infle input.csv --types int,float,str\n```\n\nThe resulted records in the table would be:\n\n```\n+----+---------+-----------+-------------+\n| id | int_col | float_col | varchar_col |\n+----+---------+-----------+-------------+\n| 1 | 1 | 1 | aaa |\n| 2 | 2 | 2 | bbb |\n+----+---------+-----------+-------------+\n```\n\nNote that type names in `--types` are the same as Python primitive type function names.\nAlso it can be short form like `--types i,f,s`\n\nCurrently it supports only `int`, `float` and `str`.\n\n#### NULLs\n\nYou may have CSV file contains empty cell like:\n\n```\nint_col,float_col,varchar_col\n1,,aaa\n2,2.0,\n```\n\nIf you want to treat them as 'NULL' in database, use `--nullable` option to convert them before insertion:\n\n```\n$ sqlcsv insert\n --sql 'INSERT INTO testtable(int_col, float_col, varchar_col) VALUES (%s, %s, %s)\n --infile input.csv --types int,float,str --nullable false,true,true \\\n```\n\nThe result would be:\n\n```\n+----+---------+-----------+-------------+\n| id | int_col | float_col | varchar_col |\n+----+---------+-----------+-------------+\n| 1 | 1 | NULL | aaa |\n| 2 | 2 | 2 | NULL |\n+----+---------+-----------+-------------+\n```\n\nNote that values of `--nullable` have to be one of `true` or `false`, and they can also be written as `t` or `f` in a short form.\n\n## More options\n\n### CSV dialect\n\nIf your desired input or output is tab-separated (TSV), use `--tab` option:\n\n```\n$ sqlcsv --tab select --sql 'SELECT * FROM testtable'\nid\tint_col\tfloat_col\tvarchar_col\n1\t1\t1.0\taaa\n2\t2\t2.0\tbbb\n```\n\nFor other format settings, see `sqlcsv --help`.\nBasically it supports the same dialect specification as [csv package in Python's standard libraries](https://docs.python.org/3/library/csv.html) does.\n\n### Read SQL from file\n\nIn both `select` and `insert` subcommands you can use `--sqlfile` option intead of `--sql` in order to read query from a file:\n\n```\n$ sqlcsv select --sqlfile query.sql\n$ sqlcsv insert --sqlfile query.sql ...\n```\n\n### Pre and post querying\n\nIn case you need to execute short query before/after the main query runs, it provides `--pre-sql` and `--post-sql` options to satisfy such needs:\n\n```\n$ sqlcsv select --pre-sql 'SET SESSION wait_timeout = 60' --sql ...\n```\n\n### Chunked insertion\n\nWhen you import a large number of records into database, `--chunk-size` option is helpful to save memory usage by splitting file contents up into different pieces and transfer each of them to the database repeatedly.\n\n```\n$ sqlcsv insert --sql ... --infile ... --types ... --chunk-size 1000\n```\n\n### Run in transaction\n\nIf you want multiple queries executed in single command call such as ones specified by `--pre-sql` or `--post-sql` to be run in the same transaction, use `--transaction` option as follows:\n\n```\n$ sqlcsv --transaction select --pre-sql ... --post-sql ... --sql ...\n```\n\nIt is also a good practice to use this option with `--chunk-size` in order to execute chunked insersion atomically and to avoid leaving incomplete data on table when the query is cancelled or aborted.\n\n## Comparison between other tools\n\n### LOAD or COPY\n\nMajor RDBMSs usually have built-in instructions to import data from files such as `LOAD` for MySQL or `COPY` for PostgreSQL.\nThey are obviously the primary choices you may consider but also have some limitations:\n\n- Few platform support import/export across network; others only can do from local files\n- Specification for data format or instruction varies for each platform\n\nSqlcsv works remotely and provides unified interfaces (except SQL dialects).\n\n### CSVKit\n\n[CSVKit](https://csvkit.readthedocs.io) is a popular toolkit for manipulating CSV files.\nIt provides [sql2csv](https://csvkit.readthedocs.io/en/latest/scripts/sql2csv.html) and [csvsql](https://csvkit.readthedocs.io/en/latest/scripts/csvsql.html) commands for export/import data from/to SQL databases.\n**Consider using them before choosing sqlcsv** if they just satisfy your needs, as they have much more users and contributers.\nHoever, there sill might be a few reasons to prefer sqlcsv to them (and this is why it was created):\n\n- CSVKit depends on several libraries including [agate](https://agate.readthedocs.io/) but not all of them are needed for interoperability between SQL databases and CSV files.\n Sqlcsv uses [csv package in Python's standard libraries](https://docs.python.org/3/library/csv.html) to interact with CSV files and [SQLAlchemy](https://www.sqlalchemy.org/) to query SQL databases, which leads to less library dependencies.\n- CSVKit's csvsql command takes just table name for import, which make it easy to use.\n However, it is sometimes inconvenient in such cases where CSV file includes only a part of columns and others are generated dynamically by SQL expressions.\n Sqlcsv's `insert` subcommand, by contrast, takes `INSERT` statement, which might be verbose but provides more flexibility.\n\n### Pandas\n\nIf you do not care about library dependencies, do not need custom `INSERT` statement to be specified, and even do not need command line interfaces, then just use [pandas](https://pandas.pydata.org/)' [DataFrame.to_sql](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_sql.html) or [read_sql](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_sql.html).\nThey will help you a lot if used with [DataFrame.to_csv](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html) or [read_csv](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html).\n\n### Embulk\n\nIf your dataset is so large that requires optimization such as parallel processing, or you want some sophisticated I/O functionality such as retrying, consider using [Embulk](https://github.com/embulk/embulk).\nIt also provides well-developed plugin ecosystem that enables support of various data stores and data formats.", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/yubessy/sqlcsv", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "sqlcsv", "package_url": "https://pypi.org/project/sqlcsv/", "platform": "", "project_url": "https://pypi.org/project/sqlcsv/", "project_urls": { "Homepage": "https://github.com/yubessy/sqlcsv" }, "release_url": "https://pypi.org/project/sqlcsv/0.4.1/", "requires_dist": null, "requires_python": "", "summary": "Import/Export data to/from relational databases using SQL statements with CSV files", "version": "0.4.1" }, "last_serial": 4576553, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "62904e29907f3740e10984c9458b3f0c", "sha256": "cbcd50f341668a65f47960947cb6370f696405164c3bf67d924dee35c604c449" }, "downloads": -1, "filename": "sqlcsv-0.1.0.tar.gz", "has_sig": false, "md5_digest": "62904e29907f3740e10984c9458b3f0c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6542, "upload_time": "2018-11-25T12:03:53", "url": "https://files.pythonhosted.org/packages/4f/d2/0e81af24a4978031c482bf880c4cc59111b17262035fef008cd45e75acc2/sqlcsv-0.1.0.tar.gz" } ], "0.1.1": [ { "comment_text": "", "digests": { "md5": "08d54c7c447a3fc4387c6d3aa98c3240", "sha256": "4c5658783635536d8a721f53ef38809ced02122f874cde526799a70fa52330a5" }, "downloads": -1, "filename": "sqlcsv-0.1.1.tar.gz", "has_sig": false, "md5_digest": "08d54c7c447a3fc4387c6d3aa98c3240", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6506, "upload_time": "2018-11-25T12:25:30", "url": "https://files.pythonhosted.org/packages/a0/3e/867c80cf49052368dae20a4c961deb2eae74a0b74d1312ce2e581da065e9/sqlcsv-0.1.1.tar.gz" } ], "0.2.0": [ { "comment_text": "", "digests": { "md5": "4c4f70e945e0990f0e5dd9748f9c9b1c", "sha256": "d6b7780615fa5156cb12524dbacb0670580a41d107e856d2e7d7aeb80d8f3201" }, "downloads": -1, "filename": "sqlcsv-0.2.0.tar.gz", "has_sig": false, "md5_digest": "4c4f70e945e0990f0e5dd9748f9c9b1c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7417, "upload_time": "2018-11-25T13:24:02", "url": "https://files.pythonhosted.org/packages/7f/62/dabd26f3f15a1929e345b3f54db2d393e7b8ee412214ca87d556487f1cd6/sqlcsv-0.2.0.tar.gz" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "e6f79a7fda9f20851cc59ed475868a74", "sha256": "eb020473b84e9919317598f0174d0741f3b6a6e7eb3b9cb60646ac38977f5797" }, "downloads": -1, "filename": "sqlcsv-0.3.0.tar.gz", "has_sig": false, "md5_digest": "e6f79a7fda9f20851cc59ed475868a74", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7413, "upload_time": "2018-11-27T09:08:53", "url": "https://files.pythonhosted.org/packages/b5/3b/8543b3903185dbdf0891b80651815ad06cea0184c3d3ff0488fd2c4b0e53/sqlcsv-0.3.0.tar.gz" } ], "0.4.0": [ { "comment_text": "", "digests": { "md5": "d13222d731a59d6e01328fbfccfa7003", "sha256": "95cda8af5ac101d7cf24a5d71ee58451ad704ef411401230f45968f8bd60350a" }, "downloads": -1, "filename": "sqlcsv-0.4.0.tar.gz", "has_sig": false, "md5_digest": "d13222d731a59d6e01328fbfccfa7003", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6961, "upload_time": "2018-12-09T01:47:18", "url": "https://files.pythonhosted.org/packages/f2/df/2bc85a43d05f04f50c1b29ecde74c81790ecce4851f74e11a1df6b37790b/sqlcsv-0.4.0.tar.gz" } ], "0.4.1": [ { "comment_text": "", "digests": { "md5": "cfe23ec5665abec31200ad4f77d3bac5", "sha256": "efe01f03df36351baa9e1927466c5d05fc93c3fcb458a51682e6a89545b6fcfd" }, "downloads": -1, "filename": "sqlcsv-0.4.1.tar.gz", "has_sig": false, "md5_digest": "cfe23ec5665abec31200ad4f77d3bac5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8442, "upload_time": "2018-12-09T03:33:25", "url": "https://files.pythonhosted.org/packages/5d/6a/a8c80247d5f61fce89ecf7c808e5276ba8130b36de772a4a2e431bbc40e1/sqlcsv-0.4.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "cfe23ec5665abec31200ad4f77d3bac5", "sha256": "efe01f03df36351baa9e1927466c5d05fc93c3fcb458a51682e6a89545b6fcfd" }, "downloads": -1, "filename": "sqlcsv-0.4.1.tar.gz", "has_sig": false, "md5_digest": "cfe23ec5665abec31200ad4f77d3bac5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8442, "upload_time": "2018-12-09T03:33:25", "url": "https://files.pythonhosted.org/packages/5d/6a/a8c80247d5f61fce89ecf7c808e5276ba8130b36de772a4a2e431bbc40e1/sqlcsv-0.4.1.tar.gz" } ] }