{ "info": { "author": "Max Hully", "author_email": "max@mggg.org", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7" ], "description": "# maup\n\n[![Build Status](https://api.travis-ci.com/mggg/maup.svg?branch=master)](https://travis-ci.com/mggg/maup)\n[![codecov](https://codecov.io/gh/mggg/maup/branch/master/graph/badge.svg)](https://codecov.io/gh/mggg/maup)\n[![PyPI](https://img.shields.io/pypi/v/maup.svg?color=%23)](https://pypi.org/project/maup/)\n[![conda-forge Package](https://img.shields.io/conda/vn/conda-forge/maup.svg?color=%230099cd)](https://anaconda.org/conda-forge/maup)\n\n`maup` is the geospatial toolkit for redistricting data. The package streamlines\nthe basic workflows that arise when working with blocks, precincts, and\ndistricts, such as\n\n- [Assigning precincts to districts](#assigning-precincts-to-districts),\n- [Aggregating block data to precincts](#aggregating-block-data-to-precincts),\n- [Disaggregating data from precincts down to blocks](#disaggregating-data-from-precincts-down-to-blocks),\n- [Prorating data when units do not nest neatly](#prorating-data-when-units-do-not-nest-neatly),\n and\n- [Fixing overlaps and gaps](#fixing-overlaps-and-gaps)\n\nThe project's priorities are to be efficient by using spatial indices whenever\npossible and to integrate well with the existing ecosystem around\n[pandas](https://pandas.pydata.org/), [geopandas](https://geopandas.org) and\n[shapely](https://shapely.readthedocs.io/en/latest/). 
The package is distributed\nunder the MIT License.\n\n## Installation\n\nWe recommend installing `maup` from [conda-forge](https://conda-forge.org/)\nusing [conda](https://docs.conda.io/en/latest/):\n\n```console\nconda install -c conda-forge maup\n```\n\nYou can get conda by installing\n[Miniconda](https://docs.conda.io/en/latest/miniconda.html), a free Python\ndistribution made especially for data science and scientific computing. You\nmight also consider [Anaconda](https://www.anaconda.com/distribution/), which\nincludes many data science packages that you might find useful.\n\nTo install `maup` from PyPI, run `pip install maup` from your terminal.\n\n## Examples\n\nHere are some basic situations where you might find `maup` helpful. For these\nexamples, we use test data from Providence, Rhode Island, which you can find in\nour\n[Rhode Island shapefiles repo](https://github.com/mggg-states/RI-shapefiles), or\nin the `examples` folder of this repo.\n\n```python\n>>> import geopandas\n>>> import pandas\n>>>\n>>> blocks = geopandas.read_file(\"zip://./examples/blocks.zip\")\n>>> precincts = geopandas.read_file(\"zip://./examples/precincts.zip\")\n>>> districts = geopandas.read_file(\"zip://./examples/districts.zip\")\n\n```\n\n### Assigning precincts to districts\n\nThe `assign` function in `maup` takes two sets of geometries called `sources`\nand `targets` and returns a pandas `Series`. The Series maps each geometry in\n`sources` to the geometry in `targets` that covers it. (Here, geometry _A_\n_covers_ geometry _B_ if every point of _B_ and its boundary lies in _A_ or its\nboundary.) 
If a source geometry is not covered by one single target geometry, it\nis assigned to the target geometry that covers the largest portion of its area.\n\n```python\n>>> import maup\n>>>\n>>> assignment = maup.assign(precincts, districts)\n>>> # Add the assigned districts as a column of the `precincts` GeoDataFrame:\n>>> precincts[\"DISTRICT\"] = assignment\n>>> assignment.head()\n0 7\n1 5\n2 13\n3 6\n4 1\ndtype: int64\n\n```\n\nAs an aside, you can use that `assignment` object to create a\n[gerrychain](https://gerrychain.readthedocs.io/en/latest/) `Partition`\nrepresenting this districting plan.\n\n### Aggregating block data to precincts\n\nPrecinct shapefiles usually come with election data, but not demographic data.\nIn order to study their demographics, we need to aggregate demographic data from\ncensus blocks up to the precinct level. We can do this by assigning blocks to\nprecincts and then aggregating the data with a Pandas\n[`groupby`](http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html)\noperation:\n\n```python\n>>> variables = [\"TOTPOP\", \"NH_BLACK\", \"NH_WHITE\"]\n>>>\n>>> assignment = maup.assign(blocks, precincts)\n>>> precincts[variables] = blocks[variables].groupby(assignment).sum()\n>>> precincts[variables].head()\n TOTPOP NH_BLACK NH_WHITE\n0 5907 886 380\n1 5636 924 1301\n2 6549 584 4699\n3 6009 435 1053\n4 4962 156 3713\n\n```\n\nIf you want to move data from one set of geometries to another but your source\nand target geometries do not nest neatly (i.e. have overlaps), see\n[Prorating data when units do not nest neatly](#prorating-data-when-units-do-not-nest-neatly).\n\n### Disaggregating data from precincts down to blocks\n\nIt's common to have data at a coarser scale that you want to attach to\nfiner-scaled geometries. 
Usually this happens when vote totals for a certain\nelection are only reported at the county level, and we want to attach that data\nto precinct geometries.\n\nLet's say we want to prorate the vote totals in the columns `\"PRES16D\"` and\n`\"PRES16R\"` from our `precincts` GeoDataFrame down to our `blocks` GeoDataFrame.\nThe first crucial step is to decide how we want to distribute a precinct's data\nto the blocks within it. Since we're prorating election data, it makes sense to\nuse a block's total population or voting-age population. Here's how we might\nprorate by population (`\"TOTPOP\"`):\n\n```python\n>>> election_columns = [\"PRES16D\", \"PRES16R\"]\n>>> assignment = maup.assign(blocks, precincts)\n>>>\n>>> # We prorate the vote totals according to each block's share of the overall\n>>> # precinct population:\n>>> weights = blocks.TOTPOP / assignment.map(precincts.TOTPOP)\n>>> prorated = maup.prorate(assignment, precincts[election_columns], weights)\n>>>\n>>> # Add the prorated vote totals as columns on the `blocks` GeoDataFrame:\n>>> blocks[election_columns] = prorated\n>>> # We'll call .round(2) to round the values for display purposes.\n>>> blocks[election_columns].round(2).head()\n PRES16D PRES16R\n0 0.00 0.00\n1 12.26 1.70\n2 15.20 2.62\n3 15.50 2.67\n4 3.28 0.45\n\n```\n\n#### Warning about areal interpolation\n\n**We strongly urge you _not_ to prorate by area!** The area of a census block is\n**not** a good predictor of its population. In fact, the correlation goes in the\nother direction: larger census blocks tend to be _less_ populous than smaller ones.\n\n### Prorating data when units do not nest neatly\n\nSuppose you have a shapefile of precincts with some election results data and\nyou want to join that data onto a different, more recent precincts shapefile.\nThe two sets of precincts will have overlaps, and will not nest neatly like the\nblocks and precincts did in the above examples. 
(Not that blocks and precincts\nalways nest neatly...)\n\nWe can use `maup.intersections` to break the two sets of precincts into pieces\nthat nest neatly into both sets. Then we can disaggregate from the old precincts\nonto these pieces, and aggregate up from the pieces to the new precincts. This\nmove is a bit complicated, so `maup` provides a function called `prorate` that\ndoes just that.\n\nWe'll use our same `blocks` GeoDataFrame to estimate the populations of the\npieces for the purposes of proration.\n\nFor our \"new precincts\" shapefile, we'll use the VTD shapefile for Rhode Island\nthat the U.S. Census Bureau produced as part of their 2018 test run for the\n2020 Census.\n\n```python\n>>> old_precincts = precincts\n>>> new_precincts = geopandas.read_file(\"zip://./examples/new_precincts.zip\")\n>>>\n>>> columns = [\"SEN18D\", \"SEN18R\"]\n>>>\n>>> # Include area_cutoff=0 to ignore any intersections with no area,\n>>> # like boundary intersections, which we do not want to include in\n>>> # our proration.\n>>> pieces = maup.intersections(old_precincts, new_precincts, area_cutoff=0)\n>>>\n>>> # Weight by prorated population from blocks\n>>> weights = blocks[\"TOTPOP\"].groupby(maup.assign(blocks, pieces)).sum()\n>>> # Normalize the weights so that votes are allocated according to their\n>>> # share of population in the old_precincts\n>>> weights = maup.normalize(weights, level=0)\n>>>\n>>> # Use blocks to estimate population of each piece\n>>> new_precincts[columns] = maup.prorate(\n... pieces,\n... old_precincts[columns],\n... weights=weights\n... )\n>>> new_precincts[columns].head()\n SEN18D SEN18R\n0 752.0 51.0\n1 370.0 21.0\n2 97.0 17.0\n3 585.0 74.0\n4 246.0 20.0\n\n```\n\n### Fixing overlaps and gaps\n\nPrecinct shapefiles are often created by stitching together collections of\nprecinct geometries sourced from different counties or different years. 
As a\nresult, the shapefile often has gaps or overlaps between precincts where the\ndifferent sources disagree about the boundaries. These gaps and overlaps pose\nproblems when you are interested in working with the adjacency graph of the\nprecincts, and not just in mapping the precincts. This adjacency information is\nespecially important when studying redistricting, because districts are almost\nalways expected to be contiguous.\n\n`maup` provides functions for closing gaps and resolving overlaps in a\ncollection of geometries. As an example, we'll apply both functions to these\ngeometries, which have both an overlap and a gap:\n\n![Four polygons with a gap and some overlaps](./examples/plot.png)\n\nUsually the gaps and overlaps in real shapefiles are tiny and easy to miss, but\nthis exaggerated example will help illustrate the functionality.\n\nFirst, we'll use `shapely` to create the polygons from scratch:\n\n```python\n>>> from shapely.geometry import Polygon\n>>> geometries = geopandas.GeoSeries([\n... Polygon([(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]),\n... Polygon([(2, 0), (4, 0), (4, 2), (2, 2)]),\n... Polygon([(0, 2), (2, 2), (2, 4), (0, 4)]),\n... Polygon([(2, 1), (4, 1), (4, 4), (2, 4)]),\n... 
])\n\n```\n\nNow we'll close the gap:\n\n```python\n>>> without_gaps = maup.close_gaps(geometries)\n>>> without_gaps\n0 POLYGON ((0 0, 2 0, 2 1, 1 1, 1 2, 0 2, 0 0))\n1 POLYGON ((2 0, 4 0, 4 2, 2 2, 2 0))\n2 POLYGON ((0 2, 2 2, 2 4, 0 4, 0 2))\n3 POLYGON ((2 1, 4 1, 4 4, 2 4, 2 1))\ndtype: object\n\n```\n\nThe `without_gaps` geometries look like this:\n\n![Four polygons with two overlapping](./examples/plot_without_gaps.png)\n\nAnd then resolve the overlaps:\n\n```python\n>>> without_overlaps_or_gaps = maup.resolve_overlaps(without_gaps)\n>>> without_overlaps_or_gaps\n0 POLYGON ((0 0, 2 0, 2 1, 1 1, 1 2, 0 2, 0 0))\n1 POLYGON ((2 0, 4 0, 4 2, 2 2, 2 0))\n2 POLYGON ((0 2, 2 2, 2 4, 0 4, 0 2))\n3 POLYGON ((2 1, 4 1, 4 4, 2 4, 2 1))\ndtype: object\n\n```\n\nThe `without_overlaps_or_gaps` geometries look like this:\n\n![Four squares](./examples/plot_without_gaps_or_overlaps.png)\n\nBoth of the functions `resolve_overlaps` and `close_gaps` accept a\n`relative_threshold` argument. This threshold controls how large of a gap or\noverlap the function will attempt to fix. The default value of\n`relative_threshold` is `0.1`, which means that the functions will leave alone\nany gap/overlap whose area is more than 10% of the area of the geometries that\nmight absorb that gap/overlap. In the above example, we set\n`relative_threshold=None` to ensure that no gaps or overlaps were ignored.\n\n## Modifiable areal unit problem\n\nThe name of this package comes from the\n[modifiable areal unit problem (MAUP)](https://en.wikipedia.org/wiki/Modifiable_areal_unit_problem):\nthe same spatial data will look different depending on how you divide up the\nspace. 
Since `maup` is all about changing the way your data is aggregated and\npartitioned, we have named it after the MAUP to encourage users to use the\ntoolkit thoughtfully and responsibly.\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/mggg/maup", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "maup", "package_url": "https://pypi.org/project/maup/", "platform": "", "project_url": "https://pypi.org/project/maup/", "project_urls": { "Homepage": "https://github.com/mggg/maup" }, "release_url": "https://pypi.org/project/maup/0.6/", "requires_dist": [ "numpy", "pandas", "geopandas", "shapely" ], "requires_python": "", "summary": "The geospatial toolkit for redistricting data", "version": "0.6" }, "last_serial": 5426074, "releases": { "0.2": [ { "comment_text": "", "digests": { "md5": "a2f764fc48e913cfb7d86c260cdc7a50", "sha256": "9770d832e09f300f098751d8d9d3fde8c0b6bc26453e7426ce9911d2c433ac18" }, "downloads": -1, "filename": "maup-0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "a2f764fc48e913cfb7d86c260cdc7a50", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 6138, "upload_time": "2019-03-22T21:45:53", "url": "https://files.pythonhosted.org/packages/59/e9/d78268ffd9d20e83e908d0368d24acc6ef5ba8b5f04e9aefa6b28e63358a/maup-0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "db2f2a320f120cb5b866e892ef2942c5", "sha256": "ebb3394d998b80679e064f4111f39658272152986bd2c60e83a2842bd7ecd39c" }, "downloads": -1, "filename": "maup-0.2.tar.gz", "has_sig": false, "md5_digest": "db2f2a320f120cb5b866e892ef2942c5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 4739, "upload_time": "2019-03-22T21:45:55", "url": 
"https://files.pythonhosted.org/packages/9b/e3/7e3c25d4933e7e858d54886565f392c1e5c30325ed97de368a1a75514b83/maup-0.2.tar.gz" } ], "0.3": [ { "comment_text": "", "digests": { "md5": "44704c99ef1041c80f2ff4ed661c8f76", "sha256": "563368a1b737225bbc1a7bd71cfd14f57fde765044c72df1bfd0381df574a5af" }, "downloads": -1, "filename": "maup-0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "44704c99ef1041c80f2ff4ed661c8f76", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 7719, "upload_time": "2019-03-25T18:32:45", "url": "https://files.pythonhosted.org/packages/f6/14/9f6ecf8c4abbd3d82dd76488736807b7544cb25f6259b291c7f90d599c96/maup-0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d2988f2b20769a85e41e3c08fd42256c", "sha256": "694158463cd0b4dc2fe30508e1fcb6a1bbfb2940704bca7a8774cf11715a47a5" }, "downloads": -1, "filename": "maup-0.3.tar.gz", "has_sig": false, "md5_digest": "d2988f2b20769a85e41e3c08fd42256c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6232, "upload_time": "2019-03-25T18:32:47", "url": "https://files.pythonhosted.org/packages/e7/57/255aad70aa1acfb628a8639f8c0a74a415764d24715bc47c54725a87b845/maup-0.3.tar.gz" } ], "0.4": [ { "comment_text": "", "digests": { "md5": "86233f59a7b5cdcc47ba80af84c17fd6", "sha256": "26c2ba32f67d9723b239eba29490f58986f32c44a39c975a2efe568e6c9ced99" }, "downloads": -1, "filename": "maup-0.4-py3-none-any.whl", "has_sig": false, "md5_digest": "86233f59a7b5cdcc47ba80af84c17fd6", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 7840, "upload_time": "2019-03-25T20:15:29", "url": "https://files.pythonhosted.org/packages/2c/70/202392cf9169918d57d5012a2900c875cb01ce04f59d45653906a065e166/maup-0.4-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "956528d8e0be5d431b393063120c0f74", "sha256": "45ab1fe508a779d1452812370608ba29f1c4f4295b6a3c1e27005c0028142287" }, "downloads": -1, "filename": 
"maup-0.4.tar.gz", "has_sig": false, "md5_digest": "956528d8e0be5d431b393063120c0f74", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6341, "upload_time": "2019-03-25T20:15:30", "url": "https://files.pythonhosted.org/packages/9b/99/4c38b17718553c9451c605427365fbd56cd5ceadfe4f52e506afc9d5e04f/maup-0.4.tar.gz" } ], "0.5": [ { "comment_text": "", "digests": { "md5": "d9ee524c7ba995c9891bbb93c68727e5", "sha256": "95245c0abfa4a6ab7fd87028a4c3a7d4e8cbd5b1b24266e04194691fa9383580" }, "downloads": -1, "filename": "maup-0.5-py3-none-any.whl", "has_sig": false, "md5_digest": "d9ee524c7ba995c9891bbb93c68727e5", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 10389, "upload_time": "2019-06-05T18:46:59", "url": "https://files.pythonhosted.org/packages/cf/e7/ed50bf348395b65715b56fa8f724f0633848d3fe8a233be9fe742c1e482c/maup-0.5-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "1bb2d9a71afe67c0561e970cd3181f6c", "sha256": "861b9ada410bc115969f64b092ae6743e08e4d868e8daa2dc739517b20d4f296" }, "downloads": -1, "filename": "maup-0.5.tar.gz", "has_sig": false, "md5_digest": "1bb2d9a71afe67c0561e970cd3181f6c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9405, "upload_time": "2019-06-05T18:47:01", "url": "https://files.pythonhosted.org/packages/e1/90/570f20a91a08dfab3dc470b0e68d3a4fa8f0d6e3436063665c5daf9a8a38/maup-0.5.tar.gz" } ], "0.6": [ { "comment_text": "", "digests": { "md5": "5649db1803723c427d45cc63155403cb", "sha256": "93fbf97ccfd7570fbeba52bc5c25140d0c16dac10219f61c1a7815b22e0bd052" }, "downloads": -1, "filename": "maup-0.6-py3-none-any.whl", "has_sig": false, "md5_digest": "5649db1803723c427d45cc63155403cb", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 12876, "upload_time": "2019-06-20T14:00:43", "url": 
"https://files.pythonhosted.org/packages/43/0e/670450b58696670ed1f9fd94e20295cab0e2b23fc867bb0677ca3f58b99d/maup-0.6-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "14a8db6ae85832e0763b74da26acfb96", "sha256": "bc4e0c6e591a2d61ef52f4ebffc11ec0b762a703a535fb24d282d1ff799ad8a8" }, "downloads": -1, "filename": "maup-0.6.tar.gz", "has_sig": false, "md5_digest": "14a8db6ae85832e0763b74da26acfb96", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14998, "upload_time": "2019-06-20T14:00:45", "url": "https://files.pythonhosted.org/packages/23/40/fefdc9594a169a78fd6a4b38c0125cb666aa45520df8747709974ab9e397/maup-0.6.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "5649db1803723c427d45cc63155403cb", "sha256": "93fbf97ccfd7570fbeba52bc5c25140d0c16dac10219f61c1a7815b22e0bd052" }, "downloads": -1, "filename": "maup-0.6-py3-none-any.whl", "has_sig": false, "md5_digest": "5649db1803723c427d45cc63155403cb", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 12876, "upload_time": "2019-06-20T14:00:43", "url": "https://files.pythonhosted.org/packages/43/0e/670450b58696670ed1f9fd94e20295cab0e2b23fc867bb0677ca3f58b99d/maup-0.6-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "14a8db6ae85832e0763b74da26acfb96", "sha256": "bc4e0c6e591a2d61ef52f4ebffc11ec0b762a703a535fb24d282d1ff799ad8a8" }, "downloads": -1, "filename": "maup-0.6.tar.gz", "has_sig": false, "md5_digest": "14a8db6ae85832e0763b74da26acfb96", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14998, "upload_time": "2019-06-20T14:00:45", "url": "https://files.pythonhosted.org/packages/23/40/fefdc9594a169a78fd6a4b38c0125cb666aa45520df8747709974ab9e397/maup-0.6.tar.gz" } ] }