{ "info": { "author": "Vladimir Smelov", "author_email": "vsmelov@vsmelov.ru", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Programming Language :: Python :: 3", "Topic :: Database", "Topic :: Utilities" ], "description": "# pyspark_db_utils \n\nIt helps you with your DB deals in Spark\n\n## Documentation\n\nhttp://pyspark-db-utils.readthedocs.io/en/latest/\n\n## Example of using\n\nYou need jdbc drivers for using this lib!\nJust get drivers from\nhttps://jdbc.postgresql.org/download.html\nhttps://github.com/yandex/clickhouse-jdbc\nand put it in jars/ directory in your project\n\n### Example settings:\n```\nsettings = {\n \"PG_PROPERTIES\": {\n \"user\": \"user\",\n \"password\": \"pass\",\n \"driver\": \"org.postgresql.Driver\"\n },\n \"PG_DRIVER_PATH\": \"jars/postgresql-42.1.4.jar\",\n \"PG_URL\": \"jdbc:postgresql://db.olabs.com/dbname\",\n}\n```\n\n### Example of code\n\nsee example.py\n\n### Example of run\n```\nvsmelov@vsmelov:~/PycharmProjects/pyspark_db_utils$ mkdir jars\nvsmelov@vsmelov:~/PycharmProjects/pyspark_db_utils$ cp /var/bigdata/spark-2.2.0-bin-hadoop2.7/jars/postgresql-42.1.4.jar ./jars/\nvsmelov@vsmelov:~/PycharmProjects/pyspark_db_utils$ python3 pyspark_db_utils/example.py \nhost: ***SECRET***\ndb: ***SECRET***\nuser: ***SECRET***\npassword: ***SECRET***\n\nUsing Spark's default log4j profile: org/apache/spark/log4j-defaults.properties\nSetting default log level to \"WARN\".\nTo adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).\n18/03/05 11:43:29 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable\n18/03/05 11:43:29 WARN Utils: Your hostname, vsmelov resolves to a loopback address: 127.0.1.1; using 192.168.43.26 instead (on interface wlp2s0)\n18/03/05 11:43:29 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address\nTRY: create df\nOK: create df\n+---+-----------+\n| id| mono_id|\n+---+-----------+\n| 1| 0|\n| 2| 1|\n| 3| 2|\n| 4| 3|\n| 5| 8589934592|\n| 6| 8589934593|\n| 7| 8589934594|\n| 8| 8589934595|\n| 9| 8589934596|\n| 10|17179869184|\n| 11|17179869185|\n| 12|17179869186|\n| 13|17179869187|\n| 14|17179869188|\n| 15|25769803776|\n| 16|25769803777|\n| 17|25769803778|\n| 18|25769803779|\n| 19|25769803780|\n+---+-----------+\n\n\nTRY: write_to_pg\nOK: write_to_pg \n\nTRY: read_from_pg\nOK: read_from_pg\n+---+-----------+\n| id| mono_id|\n+---+-----------+\n| 10|17179869184|\n| 11|17179869185|\n| 12|17179869186|\n| 13|17179869187|\n| 14|17179869188|\n| 1| 0|\n| 2| 1|\n| 3| 2|\n| 4| 3|\n| 5| 8589934592|\n| 6| 8589934593|\n| 7| 8589934594|\n| 8| 8589934595|\n| 9| 8589934596|\n| 15|25769803776|\n| 16|25769803777|\n| 17|25769803778|\n| 18|25769803779|\n| 19|25769803780|\n| 1| 0|\n+---+-----------+\nonly showing top 20 rows\n\n```", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/osahp/pyspark_db_utils", "keywords": "postgres postgresql clickhouse sql pyspark spark", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "pyspark_db_utils", "package_url": "https://pypi.org/project/pyspark_db_utils/", "platform": "", "project_url": "https://pypi.org/project/pyspark_db_utils/", "project_urls": { "Homepage": "https://github.com/osahp/pyspark_db_utils" }, "release_url": "https://pypi.org/project/pyspark_db_utils/0.0.7/", "requires_dist": null, "requires_python": "", "summary": "Usefull functions for working with Database in PySpark (PostgreSQL, ClickHouse)", "version": "0.0.7" }, "last_serial": 3894667, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "40e29c66c9fe47a4e2fd633cd3fbeaa8", "sha256": "fe51d8aa734b2f3adc9d4777ba4025d2c87a52607fbfe414b92013517bc6f69c" }, "downloads": -1, "filename": "pyspark_db_utils-0.0.1.tar.gz", "has_sig": false, "md5_digest": "40e29c66c9fe47a4e2fd633cd3fbeaa8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4927, "upload_time": "2017-10-20T12:28:44", "url": "https://files.pythonhosted.org/packages/dd/22/c029444468c9bb92a78a75ab1275e3d0a23c5cf9459831227db54450ae63/pyspark_db_utils-0.0.1.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "5319f69ed85309bd8b8caf07fcb7e3ed", "sha256": "9258d1b357bdc6ee87fea8b172cbe6db528b5ba9632e8b0952c21ebcb1d76d57" }, "downloads": -1, "filename": "pyspark_db_utils-0.0.3.tar.gz", "has_sig": false, "md5_digest": "5319f69ed85309bd8b8caf07fcb7e3ed", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6475, "upload_time": "2018-03-05T08:21:43", "url": "https://files.pythonhosted.org/packages/e7/4c/38ffe31a254a31104820be35891c56fcab5b6f42724a962cb8421b097991/pyspark_db_utils-0.0.3.tar.gz" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "540c0006132752114be456160ea4f3f9", "sha256": "97256c076750b36a43123908736f0d0dad987b14c3b8e251d61478085c472585" }, "downloads": -1, "filename": "pyspark_db_utils-0.0.4.tar.gz", "has_sig": false, "md5_digest": "540c0006132752114be456160ea4f3f9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9770, "upload_time": "2018-03-05T08:32:59", "url": "https://files.pythonhosted.org/packages/19/d2/ba25666ceefc03abf356f08c1cf185aa2e76dc039455fd46a3dc6f0f69bb/pyspark_db_utils-0.0.4.tar.gz" } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "412b8b5f695e5cffb3409cedc4b5d809", "sha256": "0067b3c03b1d2e993c4ffe0f1728b9c9779956f93a5d8eac05c16c4aa0e71378" }, "downloads": -1, "filename": "pyspark_db_utils-0.0.5.tar.gz", "has_sig": false, "md5_digest": "412b8b5f695e5cffb3409cedc4b5d809", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11209, "upload_time": "2018-03-05T08:46:10", "url": "https://files.pythonhosted.org/packages/c2/2a/3088de4e8f1507da8235a73d5c59edb82d6be49232cb2843a4537a1af06a/pyspark_db_utils-0.0.5.tar.gz" } ], "0.0.6": [ { "comment_text": "", "digests": { "md5": "06a0bdb3545f7d8464122aba1b6fedd4", "sha256": "78c28b34eb10458e8803fc782b37d720915a977316bfa6a04c88789f065c8e21" }, "downloads": -1, "filename": "pyspark_db_utils-0.0.6.tar.gz", "has_sig": false, "md5_digest": "06a0bdb3545f7d8464122aba1b6fedd4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6275472, "upload_time": "2018-05-24T10:20:37", "url": "https://files.pythonhosted.org/packages/22/66/3969c09c5658eeb690cb0065d157ce5d8d0d87390803b6eaf8fa2dbf0840/pyspark_db_utils-0.0.6.tar.gz" } ], "0.0.7": [ { "comment_text": "", "digests": { "md5": "d7f4ec1ad307820f2c31b0a4ae30cc57", "sha256": "1dd91ac364b2278c41fb5d9f40cec7e1b3b702faf1d97dea775025e814331840" }, "downloads": -1, "filename": "pyspark_db_utils-0.0.7.tar.gz", "has_sig": false, "md5_digest": "d7f4ec1ad307820f2c31b0a4ae30cc57", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6275533, "upload_time": "2018-05-24T10:28:04", "url": "https://files.pythonhosted.org/packages/f3/36/a9973011d289be4c78b60dbb033c57f109bb5d7fc0f15af2370048183660/pyspark_db_utils-0.0.7.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "d7f4ec1ad307820f2c31b0a4ae30cc57", "sha256": "1dd91ac364b2278c41fb5d9f40cec7e1b3b702faf1d97dea775025e814331840" }, "downloads": -1, "filename": "pyspark_db_utils-0.0.7.tar.gz", "has_sig": false, "md5_digest": "d7f4ec1ad307820f2c31b0a4ae30cc57", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6275533, "upload_time": "2018-05-24T10:28:04", "url": "https://files.pythonhosted.org/packages/f3/36/a9973011d289be4c78b60dbb033c57f109bb5d7fc0f15af2370048183660/pyspark_db_utils-0.0.7.tar.gz" } ] }