{ "info": { "author": "Leif Azzopardi, Paul Thomas, Alistair Moffat", "author_email": "leifos@acm.org, pathom@microsoft.com, ammoffat@unimelb.edu.au", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Science/Research", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3 :: Only", "Topic :: Scientific/Engineering :: Information Analysis" ], "description": "# C/W/L Evaluation Script\nAn evaluation script based on the C/W/L framework\nthat is TREC Compatible and provides a replacement\nfor INST_EVAL, RBP_EVAL, TBG_EVAL, UMeasure, TREC_EVAL.\n\n\n## Install\n\nInstall either via `pip install cwl-eval` or ``git clone https://github.com/ireval/cwl.git``.\n`cwl-eval` requires Python 3 and Numpy.\n\n\n## Usage\n\nOnce you have installed the C/W/L Evaluation Framework using `pip install`, you should be able to use the `cwl-eval` as shown below.\nIf you have used `git clone` to install the framework, then you will need to run `cwl_eval.py` directly.\n\n Usage: cwl-eval -c -m -b \n\n Usage: cwl-eval -c -m \n\n Usage: cwl-eval \n\n Usage: cwl-eval -h\n\n- : A TREC Formatted Qrel File with relevance scores used as gains (float)\n Four column tab/space sep file with fields: topic_id unused doc_id gain\n\n- : Costs associated with element type\n\n- : If not specified, costs default to one for all elements\n Two column tab/space sep file with fields: element_type element_cost\n\n- : A TREC Formatted Result File\n Six column tab/space sep file with fields: topic_id element_type doc_id rank score run_id\n\n- : The list of metrics that are to be reported\n If not specified, a set of default metrics will be reported\n Tab/space sep file with fields: metric_name params\n\n- : Specify this file if you would like the BibTeX associated with the measures specified to be\n output to a file called \n\n- -n: Add -n flag to output column names (e.g. 
Topic, Metric, EU, ETU, EC, ETC, ED)\n\n- -r: Add the -r flag to also output residuals for each measurement.\n\n\n\n**Example without using a cost file**\n\nWhen no costs are specified, the cost per item is assumed to be 1.0, and ETC and ED will be equal.\n\n cwl-eval qrel_file result_file\n\n\n\n**Example using a cost file**\n\n cwl-eval qrel_file result_file -c cost_file\n\n\n\n**Output**\n\nA seven-column tab/space separated file that contains:\n\n- Topic ID\n- Metric Name\n- Expected Utility per Item (EU)\n- Expected Total Utility (ETU)\n- Expected Cost per Item (EC)\n- Expected Total Cost (ETC)\n- Expected Depth (ED)\n\nIf the `-r` flag is included, then another five columns will be included: ResEU, ResETU, ResEC, ResETC, ResED.\nThese report the residual values for each of the measures (i.e. the difference between the best case and worst case for unjudged items).\n\n\n\nCWL Citation\n------------\nPlease consider citing the following paper when using our code for your evaluations:\n\n @inproceedings{azzopardi2019cwl,\n author = {Azzopardi, Leif and Thomas, Paul and Moffat, Alistair},\n title = {cwl\\_eval: An Evaluation Tool for Information Retrieval},\n booktitle = {Proc. 
of the 42nd International ACM SIGIR Conference},\n series = {SIGIR '19},\n year = {2019}\n }\n\n\n\nMetrics within CWL EVAL\n-----------------------\nFor each of the metrics provided in cwl_eval.py, the underlying user model\nhas been extracted and encoded within the C/W/L framework.\n\nAll weightings have been converted to probabilities.\n\nAs a result, all metrics report a series of values (not a single value):\n - Expected Utility per item examined (EU),\n - Expected Total Utility (ETU),\n - Expected Cost per item examined (EC),\n - Expected Total Cost (ETC),\n - Expected number of items to be examined, i.e. expected depth (ED).\n\nAll the values are related, such that:\n\nETU = EU * ED\n\nand\n\nETC = EC * ED\n\nFor example, for P@10 in the table below, ETU = 0.300 * 10.000 = 3.000.\n\nIf the cost per item is 1.0, then the expected cost per item (EC) is 1.0,\nand the expected total cost (ETC) will be equal to the expected depth (ED).\n\nCosts can be specified in whatever unit is desired, e.g. seconds, characters, words, etc.\n\n\n**List of Metrics**\n\n- RR - Reciprocal Rank\n- P@k - Precision at k\n- AP - Average Precision\n- RBP - Rank Biased Precision\n- INST T\n- INSQ T\n- NDCG@k - Normalized Discounted Cumulative Gain at k\n- BPM-Static - Bejewelled Player Model - Static\n- BPM-Dynamic - Bejewelled Player Model - Dynamic\n- UMeasure - U-Measure\n- TBG - Time Biased Gain\n- IFT-C1 - Information Foraging Theory (Goal)\n- IFT-C2 - Information Foraging Theory (Rate)\n- IFT-C1-C2 - Information Foraging Theory (Goal and Rate)\n\n\n\n**Sample Output from cwl_eval.py where costs per item = 1.0**\n\n cwl-eval qrel_file result_file\n\n| Topic| Metric | EU | ETU | EC | ETC | ED |\n|------|---------------------------------------------------|-------|-------|-------|--------|--------|\n| T1 | P@20 | 0.150 | 3.000 | 1.000 | 20.000 | 20.000 |\n| T1 | P@10 | 0.300 | 3.000 | 1.000 | 10.000 | 10.000 |\n| T1 | P@5 | 0.360 | 1.800 | 1.000 | 5.000 | 5.000 |\n| T1 | P@1 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |\n| T1 | RBP@0.5 | 0.566 | 1.132 | 1.000 | 2.000 | 2.000 |\n| T1 | RBP@0.9 | 
0.214 | 2.136 | 1.000 | 10.000 | 10.000 |\n| T1 | SDCG-k@10 | 0.380 | 1.726 | 1.000 | 4.544 | 4.544 |\n| T1 | SDCG-k@5 | 0.461 | 1.358 | 1.000 | 2.948 | 2.948 |\n| T1 | RR | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |\n| T1 | AP | 0.397 | 1.907 | 1.000 | 4.800 | 4.800 |\n| T1 | INST-T=2 | 0.401 | 1.303 | 1.000 | 3.242 | 3.247 |\n| T1 | INST-T=1 | 0.680 | 1.071 | 1.000 | 1.574 | 1.575 |\n| T1 | INSQ-T=2 | 0.316 | 1.428 | 1.000 | 4.509 | 4.525 |\n| T1 | INSQ-T=1 | 0.465 | 1.198 | 1.000 | 2.572 | 2.576 |\n| T1 | BPM-Static-T=1-K=1000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |\n| T1 | BPM-Static-T=1000-K=10 | 0.300 | 3.000 | 1.000 | 10.000 | 10.000 |\n| T1 | BPM-Static-T=1.2-K=10 | 0.400 | 1.200 | 1.000 | 3.000 | 3.000 |\n| T1 | BPM-Dynamic-T=1-K=1000-hb=1.0-hc=1.0 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |\n| T1 | BPM-Dynamic-T=1000-K=10-hb=1.0-hc=1.0 | 0.300 | 3.000 | 1.000 | 10.000 | 10.000 |\n| T1 | BPM-Dynamic-T=1.2-K=10-hb=1.0-hc=1.0 | 0.400 | 1.200 | 1.000 | 3.000 | 3.000 |\n| T1 | U-L@50 | 0.109 | 2.772 | 1.000 | 25.500 | 25.500 |\n| T1 | U-L@10 | 0.338 | 1.860 | 1.000 | 5.500 | 5.500 |\n| T1 | TBG-H@22 | 0.083 | 2.676 | 1.000 | 32.242 | 32.242 |\n| T1 | IFT-C1-T@2.0-b1@0.9-R1@1 | 0.456 | 1.323 | 1.000 | 2.903 | 2.903 |\n| T1 | IFT-C1-T@2.0-b1@0.9-R1@10 | 0.308 | 2.078 | 1.000 | 6.738 | 6.738 |\n| T1 | IFT-C1-T@2.0-b1@0.9-R1@100 | 0.289 | 2.224 | 1.000 | 7.698 | 7.698 |\n| T1 | IFT-C2-A@0.2-b2@0.9-R2@1 | 0.463 | 1.255 | 1.000 | 2.711 | 2.711 |\n| T1 | IFT-C2-A@0.2-b2@0.9-R2@10 | 0.293 | 2.040 | 1.000 | 6.965 | 6.965 |\n| T1 | IFT-C2-A@0.2-b2@0.9-R2@100 | 0.197 | 2.994 | 1.000 | 15.208 | 15.208 |\n| T1 | IFT-C1-C2-T@2.0-b1@0.9-R1@10-A@2.0-b2@0.9-R2@10 | 0.329 | 1.804 | 1.000 | 5.487 | 5.487 |\n| T1 | IFT-C1-C2-T@2.0-b1@0.9-R1@100-A@2.0-b2@0.9-R2@100 | 0.289 | 2.223 | 1.000 | 7.697 | 7.697 |\n\n\n**Sample Output from cwl-eval where costs are set based on cost_file**\n\n cwl-eval qrel_file result_file -c cost_file\n\n| Topic| Metric | EU | ETU | EC | ETC | ED 
|\n|------|---------------------------------------------------|-------|-------|-------|--------|--------|\n| T1 | P@20 | 0.150 | 3.000 | 1.650 | 33.000 | 20.000 |\n| T1 | P@10 | 0.300 | 3.000 | 2.300 | 23.000 | 10.000 |\n| T1 | P@5 | 0.360 | 1.800 | 2.400 | 12.000 | 5.000 |\n| T1 | P@1 | 1.000 | 1.000 | 2.000 | 2.000 | 1.000 |\n| T1 | RBP@0.5 | 0.566 | 1.132 | 1.951 | 3.902 | 2.000 |\n| T1 | RBP@0.9 | 0.214 | 2.136 | 1.776 | 17.765 | 10.000 |\n| T1 | SDCG-k@10 | 0.380 | 1.726 | 2.188 | 9.943 | 4.544 |\n| T1 | SDCG-k@5 | 0.461 | 1.358 | 2.224 | 6.557 | 2.948 |\n| T1 | RR | 1.000 | 1.000 | 2.000 | 2.000 | 1.000 |\n| T1 | AP | 0.397 | 1.907 | 1.958 | 9.400 | 4.800 |\n| T1 | INST-T=2 | 0.401 | 1.303 | 1.884 | 6.113 | 3.247 |\n| T1 | INST-T=1 | 0.680 | 1.071 | 1.955 | 3.077 | 1.575 |\n| T1 | INSQ-T=2 | 0.316 | 1.428 | 1.799 | 8.125 | 4.525 |\n| T1 | INSQ-T=1 | 0.465 | 1.198 | 1.887 | 4.855 | 2.576 |\n| T1 | BPM-Static-T=1-K=1000 | 1.000 | 1.000 | 2.000 | 2.000 | 1.000 |\n| T1 | BPM-Static-T=1000-K=10 | 0.360 | 1.800 | 2.400 | 12.000 | 5.000 |\n| T1 | BPM-Static-T=1.2-K=10 | 0.400 | 1.200 | 1.667 | 5.000 | 3.000 |\n| T1 | BPM-Dynamic-T=1-K=1000-hb=1.0-hc=1.0 | 1.000 | 1.000 | 2.000 | 2.000 | 1.000 |\n| T1 | BPM-Dynamic-T=1000-K=10-hb=1.0-hc=1.0 | 0.360 | 1.800 | 2.400 | 12.000 | 5.000 |\n| T1 | BPM-Dynamic-T=1.2-K=10-hb=1.0-hc=1.0 | 0.400 | 1.200 | 1.667 | 5.000 | 3.000 |\n| T1 | U-L@50 | 0.162 | 2.552 | 1.654 | 26.000 | 15.720 |\n| T1 | U-L@10 | 0.444 | 1.420 | 2.094 | 6.700 | 3.200 |\n| T1 | TBG-H@22 | 0.143 | 2.339 | 2.046 | 33.508 | 16.375 |\n| T1 | IFT-C1-T@2.0-b1@0.9-R1@1 | 0.456 | 1.323 | 1.971 | 5.723 | 2.903 |\n| T1 | IFT-C1-T@2.0-b1@0.9-R1@10 | 0.308 | 2.078 | 2.080 | 14.017 | 6.738 |\n| T1 | IFT-C1-T@2.0-b1@0.9-R1@100 | 0.289 | 2.224 | 2.068 | 15.922 | 7.698 |\n| T1 | IFT-C2-A@0.2-b2@0.9-R2@1 | 0.516 | 1.180 | 1.958 | 4.481 | 2.289 |\n| T1 | IFT-C2-A@0.2-b2@0.9-R2@10 | 0.404 | 1.368 | 2.011 | 6.802 | 3.382 |\n| T1 | IFT-C2-A@0.2-b2@0.9-R2@100 | 0.360 | 1.786 | 
2.388 | 11.832 | 4.954 |\n| T1 | IFT-C1-C2-T@2.0-b1@0.9-R1@10-A@2.0-b2@0.9-R2@10 | 0.413 | 1.361 | 1.990 | 6.552 | 3.293 |\n| T1 | IFT-C1-C2-T@2.0-b1@0.9-R1@100-A@2.0-b2@0.9-R2@100 | 0.360 | 1.786 | 2.388 | 11.832 | 4.954 |\n\n\n**Using the metrics_file to specify the metrics**\n\n cwl-eval qrel_file result_file -m metrics_file\n\nIf a metrics_file is not specified, CWL Eval will default to a set of metrics\ndefined in `ruler/measures/cwl_ruler.py`.\n\nIf a metrics_file is specified, CWL Eval will instantiate and use the metrics listed.\nAn example test_metrics_file is provided, which includes the following:\n\n PrecisionCWLMetric(1)\n PrecisionCWLMetric(5)\n PrecisionCWLMetric(10)\n PrecisionCWLMetric(20)\n RBPCWLMetric(0.9)\n NDCGCWLMetric(10)\n RRCWLMetric()\n APCWLMetric()\n INSTCWLMetric(1)\n INSQCWLMetric(1)\n BPMCWLMetric(1,1000)\n BPMCWLMetric(1000,10)\n BPMCWLMetric(1.2,10)\n BPMDCWLMetric(1,1000)\n BPMDCWLMetric(1000,10)\n BPMDCWLMetric(1.2,10)\n UMeasureCWLMetric(50)\n UMeasureCWLMetric(10)\n TBGCWLMetric(22)\n IFTGoalCWLMetric(2.0, 0.9, 1)\n IFTGoalCWLMetric(2.0, 0.9, 10)\n IFTGoalCWLMetric(2.0, 0.9, 100)\n IFTRateCWLMetric(0.2, 0.9, 1)\n IFTRateCWLMetric(0.2, 0.9, 10)\n IFTRateCWLMetric(0.2, 0.9, 100)\n IFTGoalRateCWLMetric(2.0,0.9,10, 0.2, 0.9, 10)\n IFTGoalRateCWLMetric(2.0,0.9,100, 0.2, 0.9, 100)\n\nTo specify which metrics you desire, inspect the metric classes in `ruler/measures/`\nto see what metrics are available, and how to parameterize them.\n\nFor example, if you want precision-based measures, you can list them as follows:\n\n PrecisionCWLMetric(1)\n PrecisionCWLMetric(2)\n PrecisionCWLMetric(3)\n PrecisionCWLMetric(4)\n PrecisionCWLMetric(5)\n PrecisionCWLMetric(6)\n PrecisionCWLMetric(7)\n PrecisionCWLMetric(8)\n PrecisionCWLMetric(9)\n PrecisionCWLMetric(10)\n PrecisionCWLMetric(11)\n PrecisionCWLMetric(12)\n PrecisionCWLMetric(13)\n PrecisionCWLMetric(14)\n PrecisionCWLMetric(15)\n PrecisionCWLMetric(16)\n PrecisionCWLMetric(17)\n 
PrecisionCWLMetric(18)\n PrecisionCWLMetric(19)\n PrecisionCWLMetric(20)\n\nSimilarly, if you want Rank Biased Precision measures, you can vary the patience parameter:\n\n RBPCWLMetric(0.1)\n RBPCWLMetric(0.2)\n RBPCWLMetric(0.3)\n RBPCWLMetric(0.4)\n RBPCWLMetric(0.5)\n RBPCWLMetric(0.6)\n RBPCWLMetric(0.7)\n RBPCWLMetric(0.8)\n RBPCWLMetric(0.9)\n RBPCWLMetric(0.95)\n RBPCWLMetric(0.99)\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/ireval/cwl", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "cwl-eval", "package_url": "https://pypi.org/project/cwl-eval/", "platform": "", "project_url": "https://pypi.org/project/cwl-eval/", "project_urls": { "Homepage": "https://github.com/ireval/cwl" }, "release_url": "https://pypi.org/project/cwl-eval/1.0.6/", "requires_dist": [ "numpy" ], "requires_python": ">=3", "summary": "An information retrieval evaluation script based on the C/W/L framework that is TREC Compatible and provides a replacement for INST_EVAL, RBP_EVAL, TBG_EVAL, UMeasure and TREC_EVAL scripts. 
All measurements are reported in the same units making all metrics directly comparable.", "version": "1.0.6" }, "last_serial": 5624683, "releases": { "1.0.6": [ { "comment_text": "", "digests": { "md5": "830abf57ea6a75320bf6e2b807b0fd12", "sha256": "8e358c63fc55b925b5c2ccfaff8994c4564d843a4b158aff9f0353675092188b" }, "downloads": -1, "filename": "cwl_eval-1.0.6-py3-none-any.whl", "has_sig": false, "md5_digest": "830abf57ea6a75320bf6e2b807b0fd12", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3", "size": 35630, "upload_time": "2019-08-02T15:04:51", "url": "https://files.pythonhosted.org/packages/70/a8/cf5d3acfaec7b136eacc6e4df3ac6d1cfa08acfd4f12f7b2297054ec7736/cwl_eval-1.0.6-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "4b77659b854f65744ef6dc7b5d8386ab", "sha256": "9516a364cef91761f8dc4092bb7dd38961a7925e7c5b4d28afabf2d7ed04d07e" }, "downloads": -1, "filename": "cwl-eval-1.0.6.tar.gz", "has_sig": false, "md5_digest": "4b77659b854f65744ef6dc7b5d8386ab", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 27442, "upload_time": "2019-08-02T15:04:54", "url": "https://files.pythonhosted.org/packages/b9/0a/047f1c6bf2447855956c9457cfdeac6d50c89054a4951914c6ba5f5d429b/cwl-eval-1.0.6.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "830abf57ea6a75320bf6e2b807b0fd12", "sha256": "8e358c63fc55b925b5c2ccfaff8994c4564d843a4b158aff9f0353675092188b" }, "downloads": -1, "filename": "cwl_eval-1.0.6-py3-none-any.whl", "has_sig": false, "md5_digest": "830abf57ea6a75320bf6e2b807b0fd12", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3", "size": 35630, "upload_time": "2019-08-02T15:04:51", "url": "https://files.pythonhosted.org/packages/70/a8/cf5d3acfaec7b136eacc6e4df3ac6d1cfa08acfd4f12f7b2297054ec7736/cwl_eval-1.0.6-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "4b77659b854f65744ef6dc7b5d8386ab", "sha256": 
"9516a364cef91761f8dc4092bb7dd38961a7925e7c5b4d28afabf2d7ed04d07e" }, "downloads": -1, "filename": "cwl-eval-1.0.6.tar.gz", "has_sig": false, "md5_digest": "4b77659b854f65744ef6dc7b5d8386ab", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 27442, "upload_time": "2019-08-02T15:04:54", "url": "https://files.pythonhosted.org/packages/b9/0a/047f1c6bf2447855956c9457cfdeac6d50c89054a4951914c6ba5f5d429b/cwl-eval-1.0.6.tar.gz" } ] }