{ "info": { "author": "Evgeny Medvedev", "author_email": "evge.medvedev@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7" ], "description": "# Bitcoin ETL\n\n[![Join the chat at https://gitter.im/ethereum-eth](https://badges.gitter.im/ethereum-etl.svg)](https://gitter.im/ethereum-etl/Lobby?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)\n[![Build Status](https://travis-ci.org/blockchain-etl/bitcoin-etl.png)](https://travis-ci.org/blockchain-etl/bitcoin-etl)\n[Join Telegram Group](https://t.me/joinchat/GsMpbA3mv1OJ6YMp3T5ORQ)\n\nInstall Bitcoin ETL:\n\n```bash\npip install bitcoin-etl\n```\n\nExport blocks and transactions ([Schema](#blocksjson), [Reference](#export_blocks_and_transactions)):\n\n```bash\n> bitcoinetl export_blocks_and_transactions --start-block 0 --end-block 500000 \\\n--provider-uri http://user:pass@localhost:8332 --chain bitcoin \\\n --blocks-output blocks.json --transactions-output transactions.json\n```\n\nSupported chains:\n- bitcoin\n- bitcoin_cash\n- dogecoin\n- litecoin\n- dash\n- zcash\n\nStream blockchain data continually to console ([Reference](#stream)):\n\n```bash\n> pip install bitcoin-etl[streaming]\n> bitcoinetl stream -p http://user:pass@localhost:8332 --start-block 500000\n```\n\nStream blockchain data continually to Google Pub/Sub ([Reference](#stream)):\n\n```bash\n> export GOOGLE_APPLICATION_CREDENTIALS=/path_to_credentials_file.json\n> bitcoinetl stream -p http://user:pass@localhost:8332 --start-block 500000 --output projects/your-project/topics/crypto_bitcoin\n\n```\n\nFor the latest version, check out the repo and call\n```bash\n> pip install -e .[streaming]\n> python bitcoinetl.py\n```\n\n## Table of Contents\n\n- 
[Schema](#schema)\n - [blocks.json](#blocksjson)\n - [transactions.json](#transactionsjson)\n- [Exporting the Blockchain](#exporting-the-blockchain)\n - [Running in Docker](#running-in-docker)\n - [Command Reference](#command-reference)\n- [Public Datasets in BigQuery](#public-datasets-in-bigquery)\n\n\n## Schema\n\n### blocks.json\n\nField | Type |\n--------------------|-----------------|\nhash | hex_string |\nsize | bigint |\nstripped_size | bigint |\nweight | bigint |\nnumber | bigint |\nversion | bigint |\nmerkle_root | hex_string |\ntimestamp | bigint |\nnonce | hex_string |\nbits | hex_string |\ncoinbase_param | hex_string |\ntransaction_count | bigint |\n\n### transactions.json\n\nField | Type |\n------------------------|-----------------------|\nhash | hex_string |\nsize | bigint |\nvirtual_size | bigint |\nversion | bigint |\nlock_time | bigint |\nblock_number | bigint |\nblock_hash | hex_string |\nblock_timestamp | bigint |\nis_coinbase | boolean |\nindex | bigint |\ninputs | []transaction_input |\noutputs | []transaction_output |\ninput_count | bigint |\noutput_count | bigint |\ninput_value | bigint |\noutput_value | bigint |\nfee | bigint |\n\n### transaction_input\n\nField | Type |\n------------------------|-----------------------|\nindex | bigint |\nspent_transaction_hash | hex_string |\nspent_output_index | bigint |\nscript_asm | string |\nscript_hex | hex_string |\nsequence | bigint |\nrequired_signatures | bigint |\ntype | string |\naddresses | []string |\nvalue | bigint |\n\n### transaction_output\n\nField | Type |\n------------------------|-----------------------|\nindex | bigint |\nscript_asm | string |\nscript_hex | hex_string |\nrequired_signatures | bigint |\ntype | string |\naddresses | []string |\nvalue | bigint |\n\n\nYou can find column descriptions in [schemas](https://github.com/blockchain-etl/bitcoin-etl-airflow/tree/master/dags/resources/stages/enrich/schemas)\n\n**Notes**:\n\n1. 
Output values returned by the Dogecoin API lost precision in clients prior to version 1.14.\nThis is caused by the issue https://github.com/dogecoin/dogecoin/issues/1558\nExplorers that used older versions to export the data may show incorrect address balances and transaction amounts.\n\n1. For Zcash, `vjoinsplit` and `valueBalance` fields are converted to inputs and outputs with type 'shielded'\nhttps://zcash-rpc.github.io/getrawtransaction.html, https://zcash.readthedocs.io/en/latest/rtd_pages/zips/zip-0243.html\n\n\n## Exporting the Blockchain\n\n1. Install Python 3.5.3+ https://www.python.org/downloads/\n\n1. Install a Bitcoin node https://hackernoon.com/a-complete-beginners-guide-to-installing-a-bitcoin-full-node-on-linux-2018-edition-cb8e384479ea\n\n1. Start Bitcoin.\nMake sure it has downloaded the blocks that you need by executing `$ bitcoin-cli getblockchaininfo` in the terminal.\nYou can export blocks below the reported `blocks` value; there is no need to wait for the full sync.\n\n1. Install Bitcoin ETL:\n\n ```bash\n > pip install bitcoin-etl\n ```\n\n1. Export blocks & transactions:\n\n ```bash\n > bitcoinetl export_all --start 0 --end 499999 \\\n --partition-batch-size 100 \\\n --provider-uri http://user:pass@localhost:8332 --chain bitcoin\n ```\n\n The result will be in the `output` subdirectory, partitioned in Hive style:\n\n ```bash\n output/blocks/start_block=00000000/end_block=00000099/blocks_00000000_00000099.csv\n output/blocks/start_block=00000100/end_block=00000199/blocks_00000100_00000199.csv\n ...\n output/transactions/start_block=00000000/end_block=00000099/transactions_00000000_00000099.csv\n ...\n ```\n\n If the `bitcoinetl` command is not available in PATH, use `python -m bitcoinetl` instead.\n\n### Running in Docker\n\n1. Install Docker https://docs.docker.com/install/\n\n1. Build a Docker image\n ```bash\n > docker build -t bitcoin-etl:latest .\n > docker image ls\n ```\n\n1. 
Run a container out of the image\n ```bash\n > docker run -v $HOME/output:/bitcoin-etl/output bitcoin-etl:latest export_blocks_and_transactions --start-block 0 --end-block 500000 \\\n --rpc-pass '' --rpc-host 'localhost' --rpc-user '' --blocks-output blocks.json --transactions-output transactions.json\n ```\n\n1. Run streaming to console or Pub/Sub\n ```bash\n > docker build -t bitcoin-etl:latest-streaming -f Dockerfile_with_streaming .\n > echo \"Stream to console\"\n > docker run bitcoin-etl:latest-streaming stream -p http://user:pass@localhost:8332 --start-block 500000\n > echo \"Stream to Pub/Sub\"\n > docker run -v /path_to_credentials_file/:/bitcoin-etl/ --env GOOGLE_APPLICATION_CREDENTIALS=/bitcoin-etl/credentials_file.json bitcoin-etl:latest-streaming stream -p http://user:pass@localhost:8332 --start-block 500000 --output projects/your-project/topics/crypto_bitcoin\n ```\n\n1. Refer to https://github.com/blockchain-etl/bitcoin-etl-streaming for deploying the streaming app to\nGoogle Kubernetes Engine.\n\n### Command Reference\n\n- [export_blocks_and_transactions](#export_blocks_and_transactions)\n- [enrich_transactions](#enrich_transactions)\n- [get_block_range_for_date](#get_block_range_for_date)\n- [export_all](#export_all)\n- [stream](#stream)\n\nAll the commands accept `-h` parameter for help, e.g.:\n\n```bash\n> bitcoinetl export_blocks_and_transactions --help\nUsage: bitcoinetl.py export_blocks_and_transactions [OPTIONS]\n\n Export blocks and transactions.\n\nOptions:\n -s, --start-block INTEGER Start block\n -e, --end-block INTEGER End block [required]\n -b, --batch-size INTEGER The number of blocks to export at a time.\n -p, --provider-uri TEXT The URI of the remote Bitcoin node\n -w, --max-workers INTEGER The maximum number of workers.\n --blocks-output TEXT The output file for blocks. If not provided\n blocks will not be exported. Use \"-\" for stdout\n --transactions-output TEXT The output file for transactions. 
If not\n provided transactions will not be exported. Use\n \"-\" for stdout\n --help Show this message and exit.\n```\n\nFor the `--output` parameters the supported type is json. The format type is inferred from the output file name.\n\n#### export_blocks_and_transactions\n\n```bash\n> bitcoinetl export_blocks_and_transactions --start-block 0 --end-block 500000 \\\n --provider-uri http://user:pass@localhost:8332 \\\n --blocks-output blocks.json --transactions-output transactions.json\n```\n\nOmit the `--blocks-output` or `--transactions-output` option if you want to export only transactions or only blocks, respectively.\n\nYou can tune `--batch-size` and `--max-workers` for performance.\n\nNote that the `required_signatures`, `type`, `addresses`, and `value` fields will be empty in transaction inputs.\nUse [enrich_transactions](#enrich_transactions) to populate those fields.\n\n#### enrich_transactions\n\n```bash\n> bitcoinetl enrich_transactions \\\n --provider-uri http://user:pass@localhost:8332 \\\n --transactions-input transactions.json --transactions-output enriched_transactions.json\n```\n\nYou can tune `--batch-size` and `--max-workers` for performance.\n\n#### get_block_range_for_date\n\n```bash\n> bitcoinetl get_block_range_for_date --provider-uri http://user:pass@localhost:8332 --date=2017-03-01\n```\n\nThis command is guaranteed to return the block range that covers all blocks with `block.time` on the specified\ndate. However, the returned block range may also contain blocks outside the specified date, because block times are not\nmonotonic https://twitter.com/EvgeMedvedev/status/1073844856009576448. 
You can filter\n`blocks.json`/`transactions.json` with the command below:\n\n```bash\n> bitcoinetl filter_items -i blocks.json -o blocks_filtered.json \\\n-p \"datetime.datetime.fromtimestamp(item['timestamp']).astimezone(datetime.timezone.utc).strftime('%Y-%m-%d') == '2017-03-01'\"\n```\n\n#### export_all\n\n```bash\n> bitcoinetl export_all --provider-uri http://user:pass@localhost:8332 --start 2018-01-01 --end 2018-01-02\n```\n\nYou can tune `--export-batch-size` and `--max-workers` for performance.\n\n#### stream\n\n```bash\n> bitcoinetl stream --provider-uri http://user:pass@localhost:8332 --start-block 500000\n```\n\n- This command outputs blocks and transactions to the console by default.\n- Use the `--output` option to specify the Google Pub/Sub topic to publish blockchain data to,\ne.g. `projects/your-project/topics/crypto_bitcoin`. Blocks and transactions will be pushed to the\n`projects/your-project/topics/crypto_bitcoin.blocks` and `projects/your-project/topics/crypto_bitcoin.transactions`\ntopics.\n- The command periodically saves its state, the last synced block number, to the `last_synced_block.txt` file.\n- Specify either the `--start-block` or the `--last-synced-block-file` option. `--last-synced-block-file` should point to the\nfile that stores the block number from which to start streaming the blockchain data.\n- Use the `--lag` option to specify how many blocks to lag behind the head of the blockchain. It's the simplest way to\nhandle chain reorganizations - they are less likely the further a block is from the head.\n- Use the `--chain` option to specify the type of the chain, e.g. 
`bitcoin`, `litecoin`, `dash`, `zcash`, etc.\n- You can tune `--period-seconds`, `--batch-size`, `--max-workers` for performance.\n\n\n### Running Tests\n\n```bash\n> pip install -e .[dev]\n> echo \"The below variables are optional\"\n> export BITCOINETL_BITCOIN_PROVIDER_URI=http://user:pass@localhost:8332\n> export BITCOINETL_LITECOIN_PROVIDER_URI=http://user:pass@localhost:8331\n> export BITCOINETL_DOGECOIN_PROVIDER_URI=http://user:pass@localhost:8330\n> export BITCOINETL_BITCOIN_CASH_PROVIDER_URI=http://user:pass@localhost:8329\n> export BITCOINETL_DASH_PROVIDER_URI=http://user:pass@localhost:8328\n> export BITCOINETL_ZCASH_PROVIDER_URI=http://user:pass@localhost:8327\n> pytest -vv\n```\n\n### Running Tox Tests\n\n```bash\n> pip install tox\n> tox\n```\n\n### Public Datasets in BigQuery\n\nhttps://cloud.google.com/blog/products/data-analytics/introducing-six-new-cryptocurrencies-in-bigquery-public-datasets-and-how-to-analyze-them", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/blockchain-etl/bitcoin-etl", "keywords": "bitcoin", "license": "", "maintainer": "", "maintainer_email": "", "name": "bitcoin-etl", "package_url": "https://pypi.org/project/bitcoin-etl/", "platform": "", "project_url": "https://pypi.org/project/bitcoin-etl/", "project_urls": { "Bug Reports": "https://github.com/blockchain-etl/bitcoin-etl/issues", "Chat": "https://gitter.im/ethereum-etl/Lobby", "Homepage": "https://github.com/blockchain-etl/bitcoin-etl", "Source": "https://github.com/blockchain-etl/bitcoin-etl" }, "release_url": "https://pypi.org/project/bitcoin-etl/1.3.1/", "requires_dist": null, "requires_python": ">=3.5.0,<3.8.0", "summary": "Tools for exporting Bitcoin blockchain data to JSON", "version": "1.3.1" }, "last_serial": 5975069, "releases": { "1.0.0": [ { "comment_text": "", "digests": { "md5": "66df7bf2513e808175fe51e3d246c251", "sha256": 
"d48f935db30bcae27a0ffa8cdb2a0b46fe48f29e0c20f3e083ca140222ff43b0" }, "downloads": -1, "filename": "bitcoin-etl-1.0.0.tar.gz", "has_sig": false, "md5_digest": "66df7bf2513e808175fe51e3d246c251", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5.3,<3.8.0", "size": 31700, "upload_time": "2019-01-12T17:51:19", "url": "https://files.pythonhosted.org/packages/5b/c8/5aa5989c4140cdf7c6978fd71e9ca13585ab18be9bf2e475a4c3335125a2/bitcoin-etl-1.0.0.tar.gz" } ], "1.1.0": [ { "comment_text": "", "digests": { "md5": "e89fa4d4142fb0597fd230a4e753762d", "sha256": "dfb578c48231d8af97496f35d055db8c475b8bd864543ac4bd638fdedeada2a6" }, "downloads": -1, "filename": "bitcoin-etl-1.1.0.tar.gz", "has_sig": false, "md5_digest": "e89fa4d4142fb0597fd230a4e753762d", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5.0,<3.8.0", "size": 35459, "upload_time": "2019-01-24T18:32:06", "url": "https://files.pythonhosted.org/packages/26/0d/e5594ddebd789e7e1959c9fc66bbd01a5773bec24b234fbc4467739c7d47/bitcoin-etl-1.1.0.tar.gz" } ], "1.2.0": [ { "comment_text": "", "digests": { "md5": "163fea7c4f8b903989e6094b6518c808", "sha256": "2b39b00b7f357873e32de2fea36ea63f5f71bf75247a533c07620ed35427d107" }, "downloads": -1, "filename": "bitcoin-etl-1.2.0.tar.gz", "has_sig": false, "md5_digest": "163fea7c4f8b903989e6094b6518c808", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5.0,<3.8.0", "size": 41192, "upload_time": "2019-03-22T16:04:55", "url": "https://files.pythonhosted.org/packages/a2/05/a2aaa69610c9ddc77e933bdc0a463758670e8435a0a74b376b1d659bdf7d/bitcoin-etl-1.2.0.tar.gz" } ], "1.2.1": [ { "comment_text": "", "digests": { "md5": "144838259726b2207422db389bb19c40", "sha256": "e6929bf3c5197ef32702fb13e586871420c2dd332df31d37f6c664431f43e63f" }, "downloads": -1, "filename": "bitcoin-etl-1.2.1.tar.gz", "has_sig": false, "md5_digest": "144838259726b2207422db389bb19c40", "packagetype": "sdist", "python_version": "source", 
"requires_python": ">=3.5.0,<3.8.0", "size": 43203, "upload_time": "2019-04-15T13:31:00", "url": "https://files.pythonhosted.org/packages/1c/f7/642ee81865a2223ee388ea71b758f258ea548d3fca3cbf6786065025bc1d/bitcoin-etl-1.2.1.tar.gz" } ], "1.3.0": [ { "comment_text": "", "digests": { "md5": "bb8a6bb5c893c039f2606977c0dd9f6c", "sha256": "1f1f260bfabe9b3e8d435981c38a5d0045fb594fc61ba3301b7e3fd131176b38" }, "downloads": -1, "filename": "bitcoin-etl-1.3.0.tar.gz", "has_sig": false, "md5_digest": "bb8a6bb5c893c039f2606977c0dd9f6c", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5.0,<3.8.0", "size": 43698, "upload_time": "2019-07-08T07:39:19", "url": "https://files.pythonhosted.org/packages/9e/03/e1c9c9539fe5254bd38a07202e35c67fecd09fcbb36a0e01d749132866ba/bitcoin-etl-1.3.0.tar.gz" } ], "1.3.1": [ { "comment_text": "", "digests": { "md5": "db0d431f19e8770c556653b9504fb3b4", "sha256": "750f6ac6c56b083cae718eff875c32a7b1a6416f8f212b14a4f9fe28f766b478" }, "downloads": -1, "filename": "bitcoin-etl-1.3.1.tar.gz", "has_sig": false, "md5_digest": "db0d431f19e8770c556653b9504fb3b4", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5.0,<3.8.0", "size": 37524, "upload_time": "2019-10-15T06:27:35", "url": "https://files.pythonhosted.org/packages/2e/23/2eec9fb992d0fa88461fecc8feca7730c0926519aa503f29d1b764812681/bitcoin-etl-1.3.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "db0d431f19e8770c556653b9504fb3b4", "sha256": "750f6ac6c56b083cae718eff875c32a7b1a6416f8f212b14a4f9fe28f766b478" }, "downloads": -1, "filename": "bitcoin-etl-1.3.1.tar.gz", "has_sig": false, "md5_digest": "db0d431f19e8770c556653b9504fb3b4", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5.0,<3.8.0", "size": 37524, "upload_time": "2019-10-15T06:27:35", "url": "https://files.pythonhosted.org/packages/2e/23/2eec9fb992d0fa88461fecc8feca7730c0926519aa503f29d1b764812681/bitcoin-etl-1.3.1.tar.gz" } ] }