{ "info": { "author": "Nenad Popovic", "author_email": "popovic0706@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# pygrouplib\nLibrary for dividing entities into groups based on numeric or text value\n\n\n## Quick start\n```python\nfrom pygrouplib import NumericGrouper, TextGrouper\n\n# Example data list\nemployees = []\nemployees.append({'name':'John','title':'Cardiologist','age':46})\nemployees.append({'name':'Ryan','title':'Cardiology','age':34})\nemployees.append({'name':'Kate','title':'Child Cardiologist', 'age':56})\nemployees.append({'name':'Anna','title':'Neurology', 'age':33})\nemployees.append({'name':'Mike','title':'Neurologist', 'age':38})\n\n# Group by title, ignoring \"Child\" and allowing 1 different character for each 5 characters in title.\ntg = TextGrouper()\ngroups = tg.group(employees, key=lambda x:x['title'], chars_per_error=5, ignore_list=['Child'])\nprint(*groups, sep='\\n')\n\n''' \n[{'name': 'John', 'title': 'Cardiologist', 'age': 46}, {'name': 'Ryan', 'title': 'Cardiology', 'age': 34}, {'name': 'Kate', 'title': 'Child Cardiologist', 'age': 56}]\n[{'name': 'Mike', 'title': 'Neurologist', 'age': 38}, {'name': 'Anna', 'title': 'Neurology', 'age': 33}]\n'''\n\n# Group by age into 3 subgroups\nng = NumericGrouper()\ngroups = ng.group(employees, key=lambda x:x['age'], groups=3)\nprint(*groups, sep=\"\\n\")\n\n'''\n[{'name': 'Anna', 'title': 'Neurology', 'age': 33}, {'name': 'Ryan', 'title': 'Cardiology', 'age': 34}, {'name': 'Mike', 'title': 'Neurologist', 'age': 38}]\n[{'name': 'John', 'title': 'Cardiologist', 'age': 46}]\n[{'name': 'Kate', 'title': 'Child Cardiologist', 'age': 56}]\n'''\n\n\n```\n\n\n## Installation\nPygrouplib is published on PyPI, so you can install it with `easy_install` or `pip`. 
The package name is `pygrouplib`, and the same package works on Python 2 and Python 3. Make sure you use the right version of `pip` or `easy_install` for your Python version (these may be named `pip3` and `easy_install3` respectively if you\u2019re using Python 3).\n```bash\n$ easy_install pygrouplib\n```\n```bash\n$ pip install pygrouplib\n```\n\n\n## Documentation\n\n\n### NumericGrouper\n#### group()\n- Groups elements into subgroups based on numeric value.\n- Arguments:\n - **entities** - List of entities to be divided into groups.\n - **groups** - Number of resulting subgroups. The default value is None, in which case it is calculated based on provided values from entities.\n - **key** - Function of one argument that is used to extract comparison key from each element in iterable (for example, `key=lambda x: x['value']`). The default value is None (compare the elements directly).\n- Returns a list of entities grouped into lists.\n\n### TextGrouper\n#### group()\n- Groups elements into subgroups based on text value. Similarity is calculated using the Levenshtein distance.\n- Arguments:\n - **entities** - List of elements to be divided into groups.\n - **similarity_limit** - Maximum Levenshtein distance between words to be considered similar. The default value is calculated as 1 + (1 for each chars_per_error characters).\n - **chars_per_error** - Number of characters per 1 error allowed. Levenshtein distance is considered as a number of errors. The default value is 8 (1 error is allowed for each 8 characters). \n - **ignore_list** - List of patterns to ignore when calculating text similarity. For example, with `ignore_list=['\\\\d']`, *'word123'* and *'123word45'* are considered equal.\n - **key** - Function of one argument that is used to extract comparison key from each element in iterable (for example, `key=str.lower`). 
The default value is None (compare the elements directly).\n- Returns a list of entities grouped into lists.\n#### levenshtein_distance()\n- Calculates the Levenshtein distance between two strings. Levenshtein distance between two words is the minimum number of single-character edits (i.e. insertions, deletions, or substitutions) required to change one word into the other. Comparison is case sensitive.\n- Arguments:\n - **s1**, **s2** - Strings to be compared. Leading and trailing spaces are ignored.\n - **ignore_list** - List of patterns to be ignored when comparing strings. For example, with `ignore_list=['\\\\d']`, distance between *'word123'* and *'123word45'* is 0. The default value is an empty list.", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/popovicn/pygrouplib", "keywords": "group fuzzy approximate levenshtein text numeric", "license": "", "maintainer": "", "maintainer_email": "", "name": "pygrouplib", "package_url": "https://pypi.org/project/pygrouplib/", "platform": "", "project_url": "https://pypi.org/project/pygrouplib/", "project_urls": { "Homepage": "https://github.com/popovicn/pygrouplib" }, "release_url": "https://pypi.org/project/pygrouplib/1.0.3/", "requires_dist": null, "requires_python": ">=2.7", "summary": "Library for dividing entities into groups based on numeric or text value.", "version": "1.0.3" }, "last_serial": 5931614, "releases": { "1.0.2": [ { "comment_text": "", "digests": { "md5": "4fe9e962a00b8f070acf65d736cb0133", "sha256": "09d7b04f8c6bfb7b8cc73130d1bce080e1cee932245c87912b708e7fe4cb2cc7" }, "downloads": -1, "filename": "pygrouplib-1.0.2.tar.gz", "has_sig": false, "md5_digest": "4fe9e962a00b8f070acf65d736cb0133", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7", "size": 4620, "upload_time": "2019-09-30T16:02:51", "url": 
"https://files.pythonhosted.org/packages/a8/f6/f332ec7b4d7336e62436f184a0f1e0a6de4cc13fc79519b2e6375ba9e9b4/pygrouplib-1.0.2.tar.gz" } ], "1.0.3": [ { "comment_text": "", "digests": { "md5": "ec335729b7aa3c390dbd2aba9049ed3b", "sha256": "c7eebab0d428b4136864edd9cff852230cda0f0d5797935cff936ab1cb95f4e7" }, "downloads": -1, "filename": "pygrouplib-1.0.3.tar.gz", "has_sig": false, "md5_digest": "ec335729b7aa3c390dbd2aba9049ed3b", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7", "size": 5408, "upload_time": "2019-10-05T10:15:55", "url": "https://files.pythonhosted.org/packages/84/ae/2b7c6ccca8f79dde88e6a5adc75162ea4407f43324e2dd12975cdcea8311/pygrouplib-1.0.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "ec335729b7aa3c390dbd2aba9049ed3b", "sha256": "c7eebab0d428b4136864edd9cff852230cda0f0d5797935cff936ab1cb95f4e7" }, "downloads": -1, "filename": "pygrouplib-1.0.3.tar.gz", "has_sig": false, "md5_digest": "ec335729b7aa3c390dbd2aba9049ed3b", "packagetype": "sdist", "python_version": "source", "requires_python": ">=2.7", "size": 5408, "upload_time": "2019-10-05T10:15:55", "url": "https://files.pythonhosted.org/packages/84/ae/2b7c6ccca8f79dde88e6a5adc75162ea4407f43324e2dd12975cdcea8311/pygrouplib-1.0.3.tar.gz" } ] }