{ "info": { "author": "Austin Nar", "author_email": "austin.nar@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "\nAn Example in Python\n--------------------\n\nLet's again say you have the example logs in the file `accesslog.txt`.\n\n 10.0.0.8 - - [2019-01-01:10:58:12 -0500] \"https://mysite.com/index.html\"\n 173.28.102.33 - - [2019-01-01:10:58:25 -0500] \"https://mysite.com/login\"\n\nWe first define the template.\n\n``` python\ntemplate = '{{ ip ip_address }} - - [{{ date date_time }}] \"{{ url URL }}\"'\n```\n\nWe then need to define our classes. `ip` and `url` are builtins with the package, but dates come in a variety of formats, so we must explicitly define ours here. Note that you can see all builtins using `default_classes()`.\n\n``` python\nimport tabulog, datetime \n\ndate_parser = tabulog.Parser(\n '[0-9]{4}\\\\-[0-9]{2}\\\\-[0-9]{2}:[0-9]{2}:[0-9]{2}:[0-9]{2}[ ][\\\\-\\\\+][0-9]{4}',\n lambda x:datetime.datetime.strptime(x, '%Y-%m-%d:%H:%M:%S %z'),\n name = 'date'\n)\ndate_parser\n```\n\n Parser('[0-9]{4}\\-[0-9]{2}\\-[0-9]{2}:[0-9]{2}:[0-9]{2}:[0-9]{2}[ ][\\-\\+][0-9]{4}', at 0x7f7a07574e18>, 'date')\n\n``` python\nfor key in ['ip', 'url']:\n print(tabulog.default_classes()[key])\n```\n\n Parser('[0-9]{1,3}(\\.[0-9]{1,3}){3}', at 0x7f7a06c896a8>, 'ip')\n Parser('(-|(?:http(s)?:\\/\\/)?[\\w.-]+(?:\\.[\\w\\.-]+)+[\\w\\-\\._~:/?#[\\]@!\\$&\\'\\(\\)\\*\\+,;=.]+)', at 0x7f7a06c896a8>, 'url')\n\nBoth `ip` and `url` require no formatting, so they use the identity function (`lambda x:x` in Python) as their formatter.\n\nTo get our final output in tabular format, we first combine everything into a `Template` object.\n\n``` python\n# We only need to pass our custom date parser class; the defaults are included.\nT = tabulog.Template(\n template_string = template,\n classes = [date_parser]\n)\nT\n```\n\n Template(\"{{ ip ip_address }} - - [{{ date date_time }}] \\\"{{ url URL }}\\\"\", classes = ...)\n\nNote that we only had to pass our custom 
class `date`. The builtin classes `ip` and `url` were included by default.\n\nFinally, we can read in our log file and call the `tabulate` method on our `Template` object. The final output is a pandas DataFrame.\n\n``` python\nwith open('accesslog.txt', 'r') as f:\n logs = f.read().split('\\n')[:-1]\n\nT.tabulate(logs)\n```\n\n ip_address date_time URL\n 0 10.0.0.8 2019-01-01 10:58:12-05:00 https://mysite.com/index.html\n 1 173.28.102.33 2019-01-01 10:58:25-05:00 https://mysite.com/login\n\n\nA more elegant and portable way of completing this task is to define the template and the custom class in the same file, which can be ported to Tabulog libraries in other languages, leaving only the formatters to be defined in the Python script.\n\nFirst, we define the `template` and the `classes` in a YAML file.\n\n```\n~$ cat accesslog_template.yml\n```\n\n template: '{{ ip ip_address }} - - [{{ date date_time }}] \"{{ url URL }}\"'\n classes:\n date: '[0-9]{4}\\-[0-9]{2}\\-[0-9]{2}:[0-9]{2}:[0-9]{2}:[0-9]{2}[ ][\\-\\+][0-9]{4}'\n\nNext, we define the formatters for each of our classes. 
Here we only have one, but we still put it in a dictionary, with the key matching the name of the class in the template file.\n\n``` python\nformatters = {\n 'date': lambda x:datetime.datetime.strptime(x, '%Y-%m-%d:%H:%M:%S %z')\n}\n```\n\nNext, we create our template again, this time using the `file` argument.\n\n``` python\nT = tabulog.Template(\n file = 'accesslog_template.yml',\n formatters = formatters\n)\nT\n```\n\n Template(\"{{ ip ip_address }} - - [{{ date date_time }}] \\\"{{ url URL }}\\\"\", classes = ...)\n\nAgain, we get our final output with the same call to `tabulate`.\n\n``` python\nT.tabulate(logs)\n```\n\n ip_address date_time URL\n 0 10.0.0.8 2019-01-01 10:58:12-05:00 https://mysite.com/index.html\n 1 173.28.102.33 2019-01-01 10:58:25-05:00 https://mysite.com/login\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/austinnar/tabulog", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "tabulog", "package_url": "https://pypi.org/project/tabulog/", "platform": "", "project_url": "https://pypi.org/project/tabulog/", "project_urls": { "Homepage": "http://github.com/austinnar/tabulog" }, "release_url": "https://pypi.org/project/tabulog/0.1.1/", "requires_dist": [ "PyYAML", "pandas" ], "requires_python": "", "summary": "Parsing Semi-Structured Log Files into Tabular Format", "version": "0.1.1" }, "last_serial": 5694912, "releases": { "0.1.1": [ { "comment_text": "", "digests": { "md5": "e1f7fe92127e0e32ea130dfb2fe525f5", "sha256": "1bd1776d163f286bd5d9187827cb7e9473732d2b86a718cae9930873d1920271" }, "downloads": -1, "filename": "tabulog-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "e1f7fe92127e0e32ea130dfb2fe525f5", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 5771, "upload_time": "2019-08-18T15:19:49", "url": 
"https://files.pythonhosted.org/packages/77/c9/8528650af09c1a88a193fae7c3b7b80080b58971e48b01ce0173a52cdf2d/tabulog-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1a388a6ca05e32dc4aa52a74e7ab6409", "sha256": "a41f711672c432835a5c50355b1e2abaa7311d44a53e6703d45544e7edb200e7" }, "downloads": -1, "filename": "tabulog-0.1.1.tar.gz", "has_sig": false, "md5_digest": "1a388a6ca05e32dc4aa52a74e7ab6409", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5235, "upload_time": "2019-08-18T15:19:51", "url": "https://files.pythonhosted.org/packages/0d/b8/7e171179ef39173f2bbde1676be7f2d63070674b51b6da5513284e12712e/tabulog-0.1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "e1f7fe92127e0e32ea130dfb2fe525f5", "sha256": "1bd1776d163f286bd5d9187827cb7e9473732d2b86a718cae9930873d1920271" }, "downloads": -1, "filename": "tabulog-0.1.1-py3-none-any.whl", "has_sig": false, "md5_digest": "e1f7fe92127e0e32ea130dfb2fe525f5", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 5771, "upload_time": "2019-08-18T15:19:49", "url": "https://files.pythonhosted.org/packages/77/c9/8528650af09c1a88a193fae7c3b7b80080b58971e48b01ce0173a52cdf2d/tabulog-0.1.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1a388a6ca05e32dc4aa52a74e7ab6409", "sha256": "a41f711672c432835a5c50355b1e2abaa7311d44a53e6703d45544e7edb200e7" }, "downloads": -1, "filename": "tabulog-0.1.1.tar.gz", "has_sig": false, "md5_digest": "1a388a6ca05e32dc4aa52a74e7ab6409", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5235, "upload_time": "2019-08-18T15:19:51", "url": "https://files.pythonhosted.org/packages/0d/b8/7e171179ef39173f2bbde1676be7f2d63070674b51b6da5513284e12712e/tabulog-0.1.1.tar.gz" } ] }