{ "info": { "author": "Abraham Othman", "author_email": "abrahamo@wharton.upenn.edu", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Intended Audience :: Science/Research", "License :: OSI Approved :: BSD License", "Programming Language :: Python :: 2", "Programming Language :: Python :: 3" ], "description": "# HiScore\n\n**HiScore** is a python library for making *scoring functions*, which map objects (vectors of numerical attributes) to scores (a single numerical value). \n\nScores are a way for domain experts to communicate the quality of a complex, multi-faceted object to a broader audience. Scores are ubiquitous; everything from [NFL Quarterbacks](http://en.wikipedia.org/wiki/Passer_rating) to the [walkability of neighborhoods](https://www.walkscore.com/) has a score.\n\n**HiScore** provides a new way for domain experts to quickly create and improve intuitive scoring functions: by using *reference sets*, a set of representative objects that are assigned scores.\n\n## Installation Notes\n\n**HiScore** version 1.6.0 (and higher) supports both Python 2 and Python 3. The package is installable in the typical way, e.g., with `pip install hiscore`.\n\nPlease note that **your installation will fail if you do not have numpy installed already**. This appears to be a failure on the part of one of the required python packages, **ecos**, to properly install from `pip`. Every other required package should be installed automatically when using `pip`.\n\n## Demonstration\n\nWe'll use a highly simplified version of the [World Health Organization safety score for water wells](http://www.ncbi.nlm.nih.gov/pubmed/22717748) to show how to use **HiScore**.\n\n### Setup and Reference Set Creation\n\nLet's start with a water well safety score based on only two attributes:\n\n1. The distance from the nearest latrine, in meters. (\"Distance\")\n2. The size of the water well platform, in square feet. (\"Size\")\n\nObserve that score should be *increasing* in both of these attributes (e.g., for wells of a fixed size, the score does not decrease as the well moves farther from the nearest latrine). **HiScore requires scores to always be increasing or decreasing in each attribute**. (This is typically not restrictive, as attributes usually measure something that is either good or bad.) \n\nTo quickly create an intelligent score, you can start by determining, low, middle, and high values for each attribute and then labeling all the combinations of these values. (Of course, you can label other objects as well!) For our water well score, distance could have a low value of 0m, a middle value of 10m, and a high value of 50m, while size could have a low value of 1 sq.ft., a middle value of 25 sq.ft., and a high value of 100 sq.ft. labeling all combinations of these values could yield the following reference set:\n\nDistance | Size | Score\n--------- | ---- | -----\n0 | 1 | 0\n0 | 25 | 5\n0 | 100 | 10\n10 | 1 | 20\n10 | 25 | 50\n10 | 100 | 60\n50 | 1 | 65\n50 | 25 | 90\n50 | 100 | 100\n\n\n### Creating, Improving, and Querying the Scoring Function\n\nWe can generate a scoring function by calling `hiscore.create` with this reference set:\n\n```python\t\nimport hiscore\n# The objects in the reference set are tuples so they can be hashed into a dict\nreference_set = {(0,1): 0, (0,25): 5, (0,100): 10, (10,1): 20, (10,25): 50, (10,100): 60, (50,1): 65, (50,25): 90, (50,100): 100}\n# [1,1] -> increasing in both attributes\nscore_function = hiscore.create(reference_set, [1,1], minval=0, maxval=100)\n```\n\nThe resulting scoring function interpolates exactly through the reference set (within a small epsilon):\n\n```python\t\nlist(zip(reference_set.keys(), score_function.calculate(reference_set.keys())))\n# Returns [((0, 1), 0.0), ((0, 25), 5.000000037512336), ((0, 100), 10.0), ((10, 1), 20.000000037569762), ((10, 25), 49.999999957685915), ((10, 100), 59.99999995476526), ((50, 1), 65.0), ((50, 25), 90.0), ((50, 100), 100.0)]\n```\n\nWhile producing reasonable estimates for points that are not in the reference set\n\n```python\nimport numpy as np\nnp.round(score_function.calculate([(15,36)]))\n# Returns [56.]\n```\n\nAnd also obeying monotonicity, so that increasing distance or size increases the score\n\n```python\nnp.round(score_function.calculate([(10,5),(10,10),(10,20),(10,35)]))\n# Returns [ 25., 31., 44., 51.]\nnp.round(score_function.calculate([(5,10),(10,10),(20,10),(50,10)]))\n# Returns [ 15., 31., 42., 74.]\n```\n\nOne of the strengths of **HiScore** is that it is easy to fix mis-scored points. Say, for instance, we were unhappy with a well with distance 15 and size 36 scoring a 56. Say we think it should score a 60 instead. We can add that to the reference set and re-create the scoring function.\n\n```python\nreference_set[(15,36)] = 60\nscore_function = hiscore.create(reference_set, [1,1], minval=0, maxval=100)\nscore_function.calculate([(15,36)])\n# Returns [60.]\n```\n\nHere's a three-dimensional figure of the scoring function:\n\n![Demonstration Score Function](http://aothman.wpengine.com/wp-content/uploads/2018/02/score_function_demo_new.png)\n\nObserve that it is monotone increasing along both axes and piecewise linear, but also how it picks up on shape features from the reference set, increasing more steeply with Distance as opposed to Size.\n\n### A More Complex Score\n\nScoring all low, middle, and high attribute combinations can result in having to score an exponential number of objects in a reference set as the number of attributes increases. Instead, you can use **HiScore** to create multi-level trees with sub-scores that percolate their values upwards. This enables the easy creation and control of scores with dozens of attributes.\n\nTo be concrete, let's extend our safety score for water wells to depend on two sub-scores:\n\n*\tSite Location\n\t*\tDistance to nearest latrine in meters\n\t*\tDistance to other nearest pollutant in meters\n*\tPlatform\n\t*\tSize in square feet\n\t*\tIs it damaged, cracked, or eroding away?\n\nGraphically, our water well scoring function will have the following tree shape:\n\n![Demonstration Scoring Tree](http://aothman.wpengine.com/wp-content/uploads/2018/02/tree_score_demo.png)\n\nWe can use **HiScore** by first making scoring functions for the two-subscores, again using the low-middle-high system:\n\n```python\n# Location attributes: distance to latrine (higher=better), distance to other pollutant (higher=better)\t\nlocation_reference_set = {(0,0): 0, (10,0): 5, (50,0): 10, (0,25): 0, (10,25): 50, (50,25): 75, (0,100): 5, (10,100): 70, (50,100): 100}\nlocation_subscore = hiscore.create(location_reference_set, [1,1], minval=0, maxval=100)\n\n# Platform attributes: size in SF (higher=better), damaged (1=true=damaged=bad, 0=false=undamaged=good)\nplatform_reference_set = {(1,1): 0, (1,0): 20, (25,1): 5, (25,0): 60, (100,0): 100, (100,1): 30}\nplatform_subscore = hiscore.create(platform_reference_set, [1,-1], minval=0, maxval=100)\n```\n(Note how the binary attribute of whether the platform is damaged is handled)\n\nWe can then produce a final score by combining these two scores. The following score more heavily weights the location subscore:\n```python\n# Well attributes: location subscore (higher=better), platform subscore (higher=better)\nwell_reference_set = {(0,0): 0, (100,100): 100, (100,0): 80, (0,100): 20, (50,50): 50, (100,50): 95, (50,100): 65, (0,50): 15, (50,0): 35} \nwell_score = hiscore.create(well_reference_set, [1,1], minval=0, maxval=100)\n```\n\nNow we can calculate a full example score. For instance:\n```python\nlatrine_distance = 28.6\nother_pollutant_distance = 39.0\nloc_score = location_subscore.calculate([(latrine_distance, other_pollutant_distance)])\n# loc_score => [59.95]\nplatform_size = 40.2\nis_damaged = True\nplat_score = platform_subscore.calculate([(platform_size,int(is_damaged))])\n# plat_score => [10.97]\ntotal = np.round(well_score.calculate([loc_score+plat_score]))\n# total => [47.]\n```\n\n## API\n\nStart by creating a `HiScoreEngine`:\n\n*\tcreate(reference_set_dict, monotone_relationship, minval=None, maxval=None)\n\t*\treference_set_dict: The reference set, keys are objects (e.g., a list of attributes) and values are scores.\n\t*\tmonotone_relationship: An iterable with entries that are +/- 1. +1 means the score function should be increasing along that attribute, -1 means the score function should be decreasing.\n\t*\tminval, maxval: Floats, the minimum and maximum values for the function.\n\t*\t*Returns a HiScoreEngine object that can be queried for function values.*\n\nOn that `HiScoreEngine`, you can call:\n\n*\tcalculate(xs)\n\t*\txs: An iterable of objects.\n\t*\t*Returns a list of score function evaluations at each of the tuples.*\n\n*\tvalue_bounds(object)\n\t* \tobject: A single object.\n\t* \t*Returns a two-element tuple (minimum value, maximum value) based on other entries in the reference set and the initialized minimum and maximum values, if defined.*\n\n\n## Credits and References\nDevelopment of the theoretical approach of **HiScore** is credited to a collaboration with [Ken Judd](http://www.hoover.org/fellows/kenneth-l-judd).\n\nThe algorithm itself is an extension of the quasi-Kriging technique proposed by Gleb Beliakov in a [2005 paper](http://link.springer.com/article/10.1007/s10543-005-0028-x). I explored a different algorithm for reference-set-based scoring in a [2014 AAAI paper](https://www.aaai.org/ocs/index.php/AAAI/AAAI14/paper/viewFile/8220/8454).", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/aothman/hiscore", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "hiscore", "package_url": "https://pypi.org/project/hiscore/", "platform": "", "project_url": "https://pypi.org/project/hiscore/", "project_urls": { "Homepage": "https://github.com/aothman/hiscore" }, "release_url": "https://pypi.org/project/hiscore/1.6.0/", "requires_dist": null, "requires_python": "", "summary": "A simple and powerful engine for creating scores", "version": "1.6.0" }, "last_serial": 4334435, "releases": { "1.0.0": [ { "comment_text": "", "digests": { "md5": "4df2d3ce3a49e5ed2d0b385b8eccf7a4", "sha256": "0ab1e59ff77f2d28c529164308caf153977734a24a489296467e116263db8cc0" }, "downloads": -1, "filename": "hiscore-1.0.0.tar.gz", "has_sig": false, "md5_digest": "4df2d3ce3a49e5ed2d0b385b8eccf7a4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8201, "upload_time": "2014-12-27T15:40:35", "url": "https://files.pythonhosted.org/packages/2a/83/dbe9605ef7000fec9ac70a9c6a3d649e2f40135e68c8d1be380422b91958/hiscore-1.0.0.tar.gz" } ], "1.0.1": [ { "comment_text": "", "digests": { "md5": "da1f707a4c5c5d66e2e79344edc163a2", "sha256": "6e52830ddd276c02830c03c5361298752d8801f7417b7c018002a80d848ac977" }, "downloads": -1, "filename": "hiscore-1.0.1.tar.gz", "has_sig": false, "md5_digest": "da1f707a4c5c5d66e2e79344edc163a2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8694, "upload_time": "2014-12-27T15:52:58", "url": "https://files.pythonhosted.org/packages/dd/a5/ff1e7bbf459ae32fe9971b1e0ad1e2a629d9df864dc3e7f1cb0dcb04984c/hiscore-1.0.1.tar.gz" }, { "comment_text": "", "digests": { "md5": "cce934f02389d9ccbaa795e36d1d4e74", "sha256": "58c3bd587f3f88c6e714d1cde07f1cacb6355340a3cc63d539aea5230f990193" }, "downloads": -1, "filename": "hiscore-1.0.1.zip", "has_sig": false, "md5_digest": "cce934f02389d9ccbaa795e36d1d4e74", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17673, "upload_time": "2014-12-27T15:53:01", "url": "https://files.pythonhosted.org/packages/a8/86/90a33d749818d21b1837a40b2e4fd781738e0e7b0bc20a4c3918f2a9d9e6/hiscore-1.0.1.zip" } ], "1.1.0": [ { "comment_text": "", "digests": { "md5": "8ab67f3542fdfb0d28465e98d914ae5a", "sha256": "b86db9b38da94cc7a7436edde615c953e72b45e33477e0ac6b4d1930c38a260c" }, "downloads": -1, "filename": "hiscore-1.1.0.tar.gz", "has_sig": false, "md5_digest": "8ab67f3542fdfb0d28465e98d914ae5a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8366, "upload_time": "2015-01-28T03:43:08", "url": "https://files.pythonhosted.org/packages/ae/f6/7d4bd8d12258290a290cd1175b1b9bde8bc48ea567bce338316466acdbbb/hiscore-1.1.0.tar.gz" }, { "comment_text": "", "digests": { "md5": "35086b76e9a679a3d3e94687f2cc42ca", "sha256": "ae627a8460ecf2c0456e7d2ae163f2375b5728616133e74143e46300e8920fe2" }, "downloads": -1, "filename": "hiscore-1.1.0.zip", "has_sig": false, "md5_digest": "35086b76e9a679a3d3e94687f2cc42ca", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17148, "upload_time": "2015-01-28T03:43:10", "url": "https://files.pythonhosted.org/packages/e7/f6/b5cb655e85a2ab850ea024478e7d47360e6ca1cc5de043281078f2530809/hiscore-1.1.0.zip" } ], "1.2.0": [ { "comment_text": "", "digests": { "md5": "015c0052710c3c93a4b7d029037b4559", "sha256": "742ae8127ba7a64586493fa6bac43260463846a86e2b1a7c959ab28f345cd975" }, "downloads": -1, "filename": "hiscore-1.2.0.tar.gz", "has_sig": false, "md5_digest": "015c0052710c3c93a4b7d029037b4559", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8385, "upload_time": "2015-04-03T19:09:36", "url": "https://files.pythonhosted.org/packages/99/64/9a15bf3ee6f06f2e77f1f55fe0958f1c87a84c077f80e70b5388cd170776/hiscore-1.2.0.tar.gz" }, { "comment_text": "", "digests": { "md5": "bb3a576931507bfe77ed2544739f98bf", "sha256": "c6374b689f084164a878ee64285160becfea85e6b1f6868c3c834354e5587b44" }, "downloads": -1, "filename": "hiscore-1.2.0.zip", "has_sig": false, "md5_digest": "bb3a576931507bfe77ed2544739f98bf", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17161, "upload_time": "2015-04-03T19:09:40", "url": "https://files.pythonhosted.org/packages/47/44/37cd8ac81bb35bbb8b1b0265b9b1b699cda3f4fa4e62ef1a464c5a71d822/hiscore-1.2.0.zip" } ], "1.3.0": [ { "comment_text": "", "digests": { "md5": "19bd7b9436dcf1393dae0d3af09e9ad0", "sha256": "fa2f74819b961446991d480c99a6e4333f1216c14dcbd1636a56c1a4adc5c4dc" }, "downloads": -1, "filename": "hiscore-1.3.0.tar.gz", "has_sig": false, "md5_digest": "19bd7b9436dcf1393dae0d3af09e9ad0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8426, "upload_time": "2016-07-06T05:01:58", "url": "https://files.pythonhosted.org/packages/cf/b8/b836edbb382498b5e0a370b4063e6a59559859e1b7a1f8490f69a028fb3b/hiscore-1.3.0.tar.gz" }, { "comment_text": "", "digests": { "md5": "3f7b47b35d6a6824b75b9861a23382e6", "sha256": "bc9f07c9ea46fb65234c16a32c493af6aba43f239fd4eb28c846ad3b38fb4e51" }, "downloads": -1, "filename": "hiscore-1.3.0.zip", "has_sig": false, "md5_digest": "3f7b47b35d6a6824b75b9861a23382e6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 17196, "upload_time": "2016-07-06T05:02:03", "url": "https://files.pythonhosted.org/packages/28/bd/8992ce71c7268b4cf473dbac7bc619f6b44b22fcb3cbad6aec8dda48a197/hiscore-1.3.0.zip" } ], "1.6.0": [ { "comment_text": "", "digests": { "md5": "2fc6d6c15711cb85de651c4931cdc39c", "sha256": "5024c7de09e337a333be72e95d44b4342a30bc96194050eb9c08d84a310d7a61" }, "downloads": -1, "filename": "hiscore-1.6.0.tar.gz", "has_sig": false, "md5_digest": "2fc6d6c15711cb85de651c4931cdc39c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8482, "upload_time": "2018-10-02T23:18:18", "url": "https://files.pythonhosted.org/packages/09/a4/8142b6fb75e321a1431d73d205180fe7ada64e0092e582b65f7751bab319/hiscore-1.6.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "2fc6d6c15711cb85de651c4931cdc39c", "sha256": "5024c7de09e337a333be72e95d44b4342a30bc96194050eb9c08d84a310d7a61" }, "downloads": -1, "filename": "hiscore-1.6.0.tar.gz", "has_sig": false, "md5_digest": "2fc6d6c15711cb85de651c4931cdc39c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8482, "upload_time": "2018-10-02T23:18:18", "url": "https://files.pythonhosted.org/packages/09/a4/8142b6fb75e321a1431d73d205180fe7ada64e0092e582b65f7751bab319/hiscore-1.6.0.tar.gz" } ] }