{ "info": { "author": "David J. Bianco", "author_email": "david.bianco@target.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.7" ], "description": "# HuntLib\nA Python library to help with some common threat hunting data analysis operations\n\n[![Target\u2019s CFC-Open-Source Slack](https://cfc-slack-inv.herokuapp.com/badge.svg?colorA=155799&colorB=159953)](https://cfc-slack-inv.herokuapp.com/)\n\n## What's Here?\nThe `huntlib` module provides two major object classes as well as a few convenience functions. \n\n* **ElasticDF**: Search Elastic and return results as a Pandas DataFrame\n* **SplunkDF**: Search Splunk and return results as a Pandas DataFrame\n* **entropy()** / **entropy_per_byte()**: Calculate Shannon entropy\n* **promptCreds()**: Prompt for login credentials in the terminal or from within a Jupyter notebook.\n* **edit_distance()**: Calculate how \"different\" two strings are from each other\n\n## huntlib.elastic.ElasticDF\nThe `ElasticDF()` class searches Elastic and returns results as a Pandas DataFrame. This makes it easier to work with the search results using standard data analysis techniques.\n\n### Example usage:\n\nCreate a plaintext connection to the Elastic server, no authentication\n\n```python\ne = ElasticDF(\n url=\"http://localhost:9200\"\n)\n```\n\nThe same, but with SSL and authentication\n\n```python\ne = ElasticDF(\n url=\"https://localhost:9200\",\n ssl=True,\n username=\"myuser\",\n password=\"mypass\"\n)\n```\nFetch search results from an index or index pattern for the previous day\n\n```python\ndf = e.search_df(\n lucene=\"item:5282 AND color:red\",\n index=\"myindex-*\",\n days=1\n)\n```\n\nThe same, but do not flatten structures into individual columns. 
This will result in each structure having a single column with a JSON string describing the structure.\n\n```python\ndf = e.search_df(\n lucene=\"item:5282 AND color:red\",\n index=\"myindex-*\",\n days=1,\n normalize=False\n)\n```\n\nA more complex example, showing how to set the Elastic document type, use Python-style datetime objects to constrain the search to a certain time period, and specify a user-defined field against which to do the time comparisons. The result size will be limited to no more than 1500 entries.\n\n```python\nfrom datetime import datetime, timedelta\n\ndf = e.search_df(\n lucene=\"item:5285 AND color:red\",\n index=\"myindex-*\",\n doctype=\"doc\", date_field=\"mydate\",\n start_time=datetime.now() - timedelta(days=8),\n end_time=datetime.now() - timedelta(days=6),\n limit=1500\n)\n```\n\nThe `search` and `search_df` methods will raise `InvalidRequestSearchException`\nin the event that the search request is syntactically correct but is otherwise\ninvalid, for example if you request more results than the server\nis able to provide. They will raise `AuthenticationErrorSearchException` in the\nevent the server denied the credentials during login. They can also raise an\n`UnknownSearchException` for other situations, in which case the exception\nmessage will contain the original error message returned by Elastic so you\ncan figure out what went wrong.\n\n## huntlib.splunk.SplunkDF\n\nThe `SplunkDF` class searches Splunk and returns the results as a Pandas DataFrame. This makes it easier to work with the search results using standard data analysis techniques.\n\n### Example Usage\n\nEstablish a connection to the Splunk server. 
Whether this is SSL/TLS or not depends on the server, and you don't really get a say.\n\n```python\ns = SplunkDF(\n host=splunk_server,\n username=\"myuser\",\n password=\"mypass\"\n)\n```\n\nFetch all search results across all time\n\n```python\ndf = s.search_df(\n spl=\"search index=win_events EventCode=4688\"\n)\n```\n\nFetch only specific fields, still across all time\n\n```python\ndf = s.search_df(\n spl=\"search index=win_events EventCode=4688 | table ComputerName _time New_Process_Name Account_Name Creator_Process_ID New_Process_ID Process_Command_Line\"\n)\n```\n\nTime bounded search, 2 days prior to now\n\n```python\ndf = s.search_df(\n spl=\"search index=win_events EventCode=4688\",\n days=2\n)\n```\n\nTime bounded search using Python datetime() values\n\n```python\ndf = s.search_df(\n spl=\"search index=win_events EventCode=4688\",\n start_time=datetime.now() - timedelta(days=2),\n end_time=datetime.now()\n)\n```\n\nTime bounded search using Splunk notation\n\n```python\ndf = s.search_df(\n spl=\"search index=win_events EventCode=4688\",\n start_time=\"-2d@d\",\n end_time=\"@d\"\n)\n```\n\nLimit the number of results returned to no more than 1500\n\n```python\ndf = s.search_df(\n spl=\"search index=win_events EventCode=4688\",\n limit=1500\n)\n```\n\n*NOTE: The value specified as the `limit` is also subject to a server-side max\nvalue. By default, this is 50000 and can be changed by editing limits.conf on\nthe Splunk server. If you use the limit parameter, the number of search results\nyou receive will be the lesser of the following values: 1) the actual number of\nresults available, 2) the number you asked for with `limit`, 3) the server-side\nmaximum result size. 
If you omit `limit` altogether, you will get the **true**\nnumber of search results available without being subject to additional limits, though\nyour search may take much longer to complete.*\n\n`SplunkDF` will raise `AuthenticationErrorSearchException` during initialization\nin the event the server denied the supplied credentials. \n\n## Miscellaneous Functions\n\n### Entropy\n\nWe define two entropy functions, `entropy()` and `entropy_per_byte()`. Both accept a single string as a parameter. The `entropy()` function calculates the Shannon entropy of the given string, while `entropy_per_byte()` attempts to normalize across strings of various lengths by returning the Shannon entropy divided by the length of the string. Both functions return a `float`.\n\n```python\n>>> entropy(\"The quick brown fox jumped over the lazy dog.\")\n4.425186429663008\n>>> entropy_per_byte(\"The quick brown fox jumped over the lazy dog.\")\n0.09833747621473352\n```\n\nThe higher the value, the more data is potentially embedded in the string.\n\n### Credential Handling\n\nSometimes you need to provide credentials for a service, but don't want to hard-code them into your scripts, especially if you're collaborating on a hunt. `huntlib` provides the `promptCreds()` function to help with this. This function works well both in the terminal and when called from within a Jupyter notebook.\n\nCall it like so:\n\n```python\n(username, password) = promptCreds()\n```\n\nYou can change one or both of the username/password prompts by passing arguments:\n\n```python\n(username, password) = promptCreds(uprompt=\"LAN ID: \",\n pprompt=\"LAN Pass: \")\n```\n\n### String Similarity\n\nString similarity can be expressed in terms of \"edit distance\", or the number of single-character edits necessary to turn the first string into the second string. 
This is often useful when, for example, you want to find two strings that are very similar but not identical (such as when hunting for [process impersonation](http://detect-respond.blogspot.com/2016/11/hunting-for-malware-critical-process.html)).\n\nThere are a number of different ways to compute similarity. `huntlib` provides the `edit_distance()` function for this, which supports several algorithms:\n\n* [Levenshtein Distance](https://en.wikipedia.org/wiki/Levenshtein_distance)\n* [Damerau-Levenshtein Distance](https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance)\n* [Hamming Distance](https://en.wikipedia.org/wiki/Hamming_distance)\n* [Jaro Distance](https://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance)\n* [Jaro-Winkler Distance](https://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance)\n\nHere's an example:\n\n```python\n>>> huntlib.edit_distance('svchost', 'scvhost')\n1\n```\n\nYou can specify a different algorithm using the `method` parameter. Valid methods are `levenshtein`, `damerau-levenshtein`, `hamming`, `jaro` and `jaro-winkler`. 
The default is `damerau-levenshtein`.\n\n```python\n>>> huntlib.edit_distance('svchost', 'scvhost', method='levenshtein')\n2\n```\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/target/huntlib", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "huntlib", "package_url": "https://pypi.org/project/huntlib/", "platform": "", "project_url": "https://pypi.org/project/huntlib/", "project_urls": { "Homepage": "https://github.com/target/huntlib" }, "release_url": "https://pypi.org/project/huntlib/0.3.0/", "requires_dist": [ "future", "splunk-sdk", "elasticsearch-dsl", "pandas", "numpy", "jellyfish (<0.7)" ], "requires_python": "", "summary": "A Python library to help with some common threat hunting data analysis operations", "version": "0.3.0" }, "last_serial": 4407006, "releases": { "0.2.1": [ { "comment_text": "", "digests": { "md5": "d34d3c628fe4c52dd73657d475e17ccf", "sha256": "4e08181949e2467bddb2cc80b14e22d861081deb797f2c3205133eb6bbe1ae31" }, "downloads": -1, "filename": "huntlib-0.2.1-py3-none-any.whl", "has_sig": false, "md5_digest": "d34d3c628fe4c52dd73657d475e17ccf", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 10108, "upload_time": "2018-10-04T00:56:20", "url": "https://files.pythonhosted.org/packages/dd/91/6c4bcf2466adfca2ae1b1981bf352cfc6ebf860c5b1056fd9c81cde083d8/huntlib-0.2.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e23bc1980ab6fd55e9cb8730aed26dbe", "sha256": "4033a5abeea42308f804455122754db083a17a6029a4bdb5139f2d592b217ce7" }, "downloads": -1, "filename": "huntlib-0.2.1.tar.gz", "has_sig": false, "md5_digest": "e23bc1980ab6fd55e9cb8730aed26dbe", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7965, "upload_time": "2018-10-04T00:56:22", "url": 
"https://files.pythonhosted.org/packages/e1/4c/e8d310a256ddd6d99d29c76f845726d0115799c5c479e6423e64c5683a07/huntlib-0.2.1.tar.gz" } ], "0.3.0": [ { "comment_text": "", "digests": { "md5": "72cae8423a29ea61c5de2641600f5692", "sha256": "24e76fd34ef9d4bbcbeb2ef4bcb7267cc84609dcbacb5198d8637ca5f8ae36f6" }, "downloads": -1, "filename": "huntlib-0.3.0-py3-none-any.whl", "has_sig": false, "md5_digest": "72cae8423a29ea61c5de2641600f5692", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 12200, "upload_time": "2018-10-23T15:55:47", "url": "https://files.pythonhosted.org/packages/b4/d7/037718d31f9fae3869de4e0b9b8c56f19af1036527de29ed3fd97bfcb590/huntlib-0.3.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2b72fcb5b5647c7373d152e16117c9ae", "sha256": "d81fb7d4768730de425f23e0cc0eff1afa141d7630e72122ce8c466a777124c1" }, "downloads": -1, "filename": "huntlib-0.3.0.tar.gz", "has_sig": false, "md5_digest": "2b72fcb5b5647c7373d152e16117c9ae", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11173, "upload_time": "2018-10-23T15:55:48", "url": "https://files.pythonhosted.org/packages/05/37/b0eede1f9bcf016766b0a3a1fd1a43c58cfcabfd264e77d9e0e2b59b07b6/huntlib-0.3.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "72cae8423a29ea61c5de2641600f5692", "sha256": "24e76fd34ef9d4bbcbeb2ef4bcb7267cc84609dcbacb5198d8637ca5f8ae36f6" }, "downloads": -1, "filename": "huntlib-0.3.0-py3-none-any.whl", "has_sig": false, "md5_digest": "72cae8423a29ea61c5de2641600f5692", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 12200, "upload_time": "2018-10-23T15:55:47", "url": "https://files.pythonhosted.org/packages/b4/d7/037718d31f9fae3869de4e0b9b8c56f19af1036527de29ed3fd97bfcb590/huntlib-0.3.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "2b72fcb5b5647c7373d152e16117c9ae", "sha256": 
"d81fb7d4768730de425f23e0cc0eff1afa141d7630e72122ce8c466a777124c1" }, "downloads": -1, "filename": "huntlib-0.3.0.tar.gz", "has_sig": false, "md5_digest": "2b72fcb5b5647c7373d152e16117c9ae", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11173, "upload_time": "2018-10-23T15:55:48", "url": "https://files.pythonhosted.org/packages/05/37/b0eede1f9bcf016766b0a3a1fd1a43c58cfcabfd264e77d9e0e2b59b07b6/huntlib-0.3.0.tar.gz" } ] }