{ "info": { "author": "Ronald L. Rivest", "author_email": "rivest@mit.edu", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# consistent_sampler\nRoutine ``sampler`` for providing 'consistent sampling' --- sampling that is\nconsistent across subsets. Consistent sampling works by associating a random number with\neach element; the desired sample is found by taking the subset of the desired sample size\ncontaining those elements with the smallest associated random numbers. \n\nThe sampling is *consistent* since it consistently favors elements with small associated\nrandom numbers; if two sets S and T have substantial overlap, then their samples of a given \nsize will also have substantial overlap (for the same random seed).\n\nThis routine is intended for use in election audits,\nwhere the objects being sampled are ballots, but this procedure\nis for general use. For a similar election audit sampling method,\nsee Stark's election audit tools:\n https://www.stat.berkeley.edu/~stark/Vote/auditTools.htm\n\nThis routine takes as input a finite collection of distinct object ids, a random seed, and\nsome other parameters. The sampling may be \"with replacement\" or \"without replacement\".\nOne of the additional parameters to the routine is \"take\" -- the size of the desired\nsample.\n\nIt provides as output a \"sampling order\" --- an ordered list of object ids that determine\nthe sample. For sampling without replacement, the output can not be longer than the input, as no\nobject may appear in the sample more than once. For sampling with replacement, the output \nmay be infinite in length, as an object may appear in the sample an arbitrarily large \n(even infinite) number of times. 
The output of ``sampler`` is therefore always a Python \ngenerator, capable of producing an infinitely long stream of output object ids.\n\nAs a small example of sampling without replacement:\n\n g = sampler(['A-1', 'A-2', 'A-3', 'B-1', 'B-2', 'B-3'], \n with_replacement=False, take=4, seed=314159, output='id')\n\nyields a generator g whose output can be printed:\n\n print(list(g))\n\nwhich produces:\n\n ['B-2', 'B-3', 'A-3', 'A-2']\n\nConsistent sampling is not a new idea; see, for example,\nhttps://arxiv.org/abs/1612.01041\nand the references to consistent sampling therein.\n\nThe routine here may (or may not) be novel in that it extends consistent\nsampling to sampling with replacement: when an item is sampled and then replaced\nin the set of items being sampled, it is given a new random number drawn uniformly\nfrom the set of numbers in (0, 1) larger than its previous associated number.\nTo implement this efficiently and portably, we represent a number in (0, 1) as\na variable-length decimal string of the form '0.dddddd...'. \n\nFor our applications, one big advantage of consistent sampling is the following.\nIf each county collects cast ballots separately, then it can order its own ballots\nfor sampling and interpretation independently of what other counties are doing. 
An overall\nsample can be constructed from the individual county samples.\n\nFurther documentation and examples are in the code.\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/ron-rivest/consistent_sampler", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "consistentsamplerpkg", "package_url": "https://pypi.org/project/consistentsamplerpkg/", "platform": "", "project_url": "https://pypi.org/project/consistentsamplerpkg/", "project_urls": { "Homepage": "https://github.com/ron-rivest/consistent_sampler" }, "release_url": "https://pypi.org/project/consistentsamplerpkg/1.0.2/", "requires_dist": null, "requires_python": "", "summary": "Package for consistent sampling with or without replacement.", "version": "1.0.2" }, "last_serial": 4184221, "releases": { "1.0.2": [ { "comment_text": "", "digests": { "md5": "4576355f5e17823af90e75d5114f0132", "sha256": "074d5d0669350e3392c3a974f14d38b900c31eb7005913767978624e0f99557a" }, "downloads": -1, "filename": "consistentsamplerpkg-1.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "4576355f5e17823af90e75d5114f0132", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 2823, "upload_time": "2018-08-19T01:13:50", "url": "https://files.pythonhosted.org/packages/a7/fe/b1a80ef801c54584826fc709109bee3b5f63d01fcfdc3c9ba469f4176715/consistentsamplerpkg-1.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "6042f032ce1d51058baa7ca8d9e6267e", "sha256": "ee0908b978e7f874fbb7f024a96214e7ca29228ab3f84ccd4a5824be489cc861" }, "downloads": -1, "filename": "consistentsamplerpkg-1.0.2.tar.gz", "has_sig": false, "md5_digest": "6042f032ce1d51058baa7ca8d9e6267e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2621, "upload_time": "2018-08-19T01:13:51", "url": 
"https://files.pythonhosted.org/packages/f8/f1/8e5d15380153856f62bcd63713e8f6e8757892aed726aa1108e8920599aa/consistentsamplerpkg-1.0.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "4576355f5e17823af90e75d5114f0132", "sha256": "074d5d0669350e3392c3a974f14d38b900c31eb7005913767978624e0f99557a" }, "downloads": -1, "filename": "consistentsamplerpkg-1.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "4576355f5e17823af90e75d5114f0132", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 2823, "upload_time": "2018-08-19T01:13:50", "url": "https://files.pythonhosted.org/packages/a7/fe/b1a80ef801c54584826fc709109bee3b5f63d01fcfdc3c9ba469f4176715/consistentsamplerpkg-1.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "6042f032ce1d51058baa7ca8d9e6267e", "sha256": "ee0908b978e7f874fbb7f024a96214e7ca29228ab3f84ccd4a5824be489cc861" }, "downloads": -1, "filename": "consistentsamplerpkg-1.0.2.tar.gz", "has_sig": false, "md5_digest": "6042f032ce1d51058baa7ca8d9e6267e", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 2621, "upload_time": "2018-08-19T01:13:51", "url": "https://files.pythonhosted.org/packages/f8/f1/8e5d15380153856f62bcd63713e8f6e8757892aed726aa1108e8920599aa/consistentsamplerpkg-1.0.2.tar.gz" } ] }