{ "info": { "author": "Diego", "author_email": "dmenin@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# EthToSql\n\nThis is a package that I built to use on my MSc Dissertation that parses blocks from an Ethereum geth Node, transforms the data into a relational format and inserts them into a relational database. It uses sqlalchemy to interact with the database so it should work with any vendor supported by it. I started the project using Amazon Redshift but given the limited amount of credits I had I moved to Sql Server Express Edition. I can guarantee though that the data collection process works on both.\n\nExample of how to use it: \n\n![Example](images/FullExample.png)\n\n\n## The basics\n\n* When the package is initialized, it will try to connect to the DB using the provided connection string.\n* The package does not create the DB. Make sure it already exists.\n* If the DB is SQL Server, the package then will attempt to create the DB tables if a table named \"Block\" doesn\u00e2\u20ac\u2122t exist.\n* This is not a robust process as it works under the assumption that if table \"block\" doesn\u00e2\u20ac\u2122t exist, none of the other tables exist so it will attempt to create them all.\n* If table block doesn\u00e2\u20ac\u2122t exist but any of the other do, it will fail as it will try to recreate the existing table.\n* You dont have to create the tables as sqlAlchemy will do that on your behalf but the data types won't be the best (it will create a lot of varchar(max))\n\n## How to install it\n\n```\npip install ethToSql\n```\n\n## How to use it\n\nCreate an Object and call any native Geth function:\n\n```python\nimport ethToSql as eTS\n\n###Initial Variables\nhost = 'https://YourHostAddress'\nport = 7546\n#Sql Server localhost connection string to a DB called DBTest\nconString = 'mssql+pyodbc://localhost/{}?driver=SQL+Server+Native+Client+11.0'.format('DBTest') \n\n##Create the object\nets = eTS.ethToSql(host, port, conString)\n\n#Can use any function exposed by the JSON RPC API\n#https://github.com/ethereum/wiki/wiki/JSON-RPC#json-rpc-api\n\n##Example 1: eth_blockNumber function\nprint(\n ets.hexToInt(\n ets.makeRpcRequest('eth_blockNumber', []) #returns in hexa\n )\n)\n#5980493\n\n#Example 2: eth_syncing function\nets.makeRpcRequest('eth_syncing', []) #raw return\n\n#{'currentBlock': '0x5b4301',\n# 'highestBlock': '0x5b4302',\n# 'knownStates': '0x0',\n# 'pulledStates': '0x0',\n# 'startingBlock': '0x5b42ff'}\n\n```\n\nThe package offers hexToInt and intToHex functions to facilitate conversion.\n\n\n\n## Core functions\n\n### Parse Single Block \nThe main function is called \"parseBlock\" and it will, as the name suggests, parse a block given its number\n\n```python\nets.parseBlock(bnum = 3000000, #block number\n alias = 'Test', #an alias that is saved on the Block Table\n getBalanceInfo=0, #Get the ballance of each account that had transactions on that block \n SAVE_TO_DB = True, #whether or not save to the DB\n printAtEnd = 1) #print message at the end\n\n#Finished block 3000000 with: 6 transactions, 8 unique accounts and 1 contracts created\n```\nIt will populate the 4 tables shown on the diagram below:\n![Example](images/ER.png)\n\n\n* The all the Aliases will be \"other\" as the code that populate this table is not available here.\n* A forth table called AccountBalances containing the balance of each account on each block is not being showed.\n* Bear in mind that setting the getBalanceInfo flag to 1 will populate that table by calling the \"eth_getBalance\" function for each account that participated on a transaction in the block, which will drastically increase run time.\n* A full node is necessary to use the getBalanceInfo flag for older blocks due to the state tree pruning that takes place on light nodes.\n\n\n### Run queries\n\nThe database object exposes an execute function that can be used to run any sort of query:\n```python\nprint(ets.seqldb.execute('select * from block'))\nets.seqldb.execute('delete from block')\n```\n\n### Parse Consecutive Range of Blocks\n\nThe parseRange function parses a range of blocks. It checks if the block is already on the DB to avoid parsin it twice\n\n```python\nets.parseRange(1, #start\n 20, #end\n alias = 'Test', getBalanceInfo=0, SAVE_TO_DB = True, printAtEnd = 0)\n```\nSuggest setting printAtEnd = 0 because the parse range uses tqdm to track progress\n\n## Parse an Account\n\nIt is also possible to parse all the blocks in wich an account has transactions in.\nThat is achieved using BeautifulSoup to parse the HTML from etherscam. \nThere is a hard coded limit of 100 transactions to prevent the function from running \"forever\" in case of big accounts, but that can be increased by increasing the \"translimit\" variable.\nThe accout's Alias (that string besides the name) is also captured and inserted\\updated on the Database if the necessary procedure is created. Can disable that by setting the updateAlias to 0.\n\n![Example](images/ParseAccount.png)\n\n\n\n## Clean up methods\n\nThe parseRange and parseAccount methods handles exception, so if a block fail for whatever reason, a row will be inserted into the \"failures\" table.\n\nMost of the times, the failures are due to timeouts, so calling the parse function again should work.\nThere are two functions to do that, one works by block number and the other that by querying the \"failures\" table and calling the former function for each block on the table.\nThe clean up code is pretty self explanatory and a few examples can be seen here:\n![Example](images/cleanup.png)\n\n\n\n## A few extra important details\n\nPlease note that this was not build to be run at \"production\" capacity. There are in fact a lot of \"shortcuts\" that were taken due to the limited amount of time I had to code and the fact that this is a \"one time project\". \nThe process of writing to the database for example is done using the \".to_sql\" function from a pandas data frame, so a lot of (probably unnecessary\t) data frames are created in the process. The tables structure could also have been improved by having account keys on the transaction table for example and the \"AccountAlias\" and Contract table could be the same.\n\nIn terms of code, the main function \"parseBlock\" has no exception handling it it, so it may fail during execution (although unlikely) leaving a block with missing information. I suggest wrapping the \"parseBlock\" call on a try except and outputing the block number that caused the error error to a table called \"failures\" (as the code example given above) since the clean up methids are available. I say that because the code is pretty robust as I've parsed over 600k blocks so the exceptions, if any, will probably be from the host eventually timing out.\n\nForeign keys are not being created automatically. You may want to do that manually if you are planning to download and analyse a lot of data.\n\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/dmenin/EthToSql", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "ethToSql", "package_url": "https://pypi.org/project/ethToSql/", "platform": "", "project_url": "https://pypi.org/project/ethToSql/", "project_urls": { "Homepage": "https://github.com/dmenin/EthToSql" }, "release_url": "https://pypi.org/project/ethToSql/0.4/", "requires_dist": [ "tqdm" ], "requires_python": "", "summary": "Read Data from an Ethereum node and insert into a SQL database", "version": "0.4" }, "last_serial": 4097691, "releases": { "0.3": [ { "comment_text": "", "digests": { "md5": "ca19948f68559a14f7deb40437e3c28d", "sha256": "f046dfed129f6d114ec66efd36efe18c3dfe3c829852d437f62d15b04c6630a6" }, "downloads": -1, "filename": "ethToSql-0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "ca19948f68559a14f7deb40437e3c28d", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 8804, "upload_time": "2018-07-17T17:18:59", "url": "https://files.pythonhosted.org/packages/dc/33/2a7c727c2e6afa24f934cf5f44fddecb57bc034ec55706f333aed40d01cd/ethToSql-0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "108ada9724cb787b808fd88e73d7ce14", "sha256": "c431a87a44fab0107bee0321c209099074b43818678a7c4e9bf32159d8de13b9" }, "downloads": -1, "filename": "ethToSql-0.3.tar.gz", "has_sig": false, "md5_digest": "108ada9724cb787b808fd88e73d7ce14", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8387, "upload_time": "2018-07-17T17:19:00", "url": "https://files.pythonhosted.org/packages/57/41/d473e93373ec42f2950987dd16810b4c3f11f0102d02a992142822c37418/ethToSql-0.3.tar.gz" } ], "0.4": [ { "comment_text": "", "digests": { "md5": "f4d0176ae42691613b42bd7840626abe", "sha256": "3e008e48d5180a11e3b150dcc6d78e3ab2a6840394d609ae71060b28971bc4f9" }, "downloads": -1, "filename": "ethToSql-0.4-py3-none-any.whl", "has_sig": false, "md5_digest": "f4d0176ae42691613b42bd7840626abe", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 10604, "upload_time": "2018-07-24T16:58:06", "url": "https://files.pythonhosted.org/packages/61/ab/6e9428770f3652df18769ca6a52bffe5bd9f94536584d736cae4ab909b73/ethToSql-0.4-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1347f1b3dd7bfb18bedebfda7a97253a", "sha256": "75b74a53d0ce29ccc95dda77377bacf18f01cb35bd3cc3ee6a905da7a318186c" }, "downloads": -1, "filename": "ethToSql-0.4.tar.gz", "has_sig": false, "md5_digest": "1347f1b3dd7bfb18bedebfda7a97253a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9966, "upload_time": "2018-07-24T16:58:07", "url": "https://files.pythonhosted.org/packages/38/6f/c72974207b53b037f910db32cf37395f3a3c1aa5516ce5907f5b905a7749/ethToSql-0.4.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "f4d0176ae42691613b42bd7840626abe", "sha256": "3e008e48d5180a11e3b150dcc6d78e3ab2a6840394d609ae71060b28971bc4f9" }, "downloads": -1, "filename": "ethToSql-0.4-py3-none-any.whl", "has_sig": false, "md5_digest": "f4d0176ae42691613b42bd7840626abe", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 10604, "upload_time": "2018-07-24T16:58:06", "url": "https://files.pythonhosted.org/packages/61/ab/6e9428770f3652df18769ca6a52bffe5bd9f94536584d736cae4ab909b73/ethToSql-0.4-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1347f1b3dd7bfb18bedebfda7a97253a", "sha256": "75b74a53d0ce29ccc95dda77377bacf18f01cb35bd3cc3ee6a905da7a318186c" }, "downloads": -1, "filename": "ethToSql-0.4.tar.gz", "has_sig": false, "md5_digest": "1347f1b3dd7bfb18bedebfda7a97253a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9966, "upload_time": "2018-07-24T16:58:07", "url": "https://files.pythonhosted.org/packages/38/6f/c72974207b53b037f910db32cf37395f3a3c1aa5516ce5907f5b905a7749/ethToSql-0.4.tar.gz" } ] }