{ "info": { "author": "Fabian Hofmann (FIAS), Jonas Hoersch (KIT), Fabian Gotzens (FZ J\u00fclich)", "author_email": "hofmann@fias.uni-frankfurt.de", "bugtrack_url": null, "classifiers": [ "Environment :: Console", "Intended Audience :: Science/Research", "License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# powerplantmatching\n\n\n ![https://pypi.org/project/powerplantmatching/](https://img.shields.io/pypi/v/powerplantmatching.svg) ![https://anaconda.org/conda-forge/powerplantmatching](https://img.shields.io/conda/vn/conda-forge/powerplantmatching.svg) ![](https://img.shields.io/pypi/pyversions/powerplantmatching) ![LICENSE](https://img.shields.io/pypi/l/powerplantmatching.svg) [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.1889465.svg)](https://doi.org/10.5281/zenodo.1889465)\n\nA toolset for cleaning, standardizing and combining multiple power\nplant databases.\n\nThis package provides ready-to-use power plant data for the European power system.\nStarting from openly available power plant datasets, the package cleans, standardizes\nand merges the input data to create a new combining dataset, which includes all the important information.\nThe package allows to easily update the combined data as soon as new input datasets are released.\n\n![Map of power plants in Europe](https://user-images.githubusercontent.com/19226431/46086361-36a13080-c1a8-11e8-82ed-9f04167273e5.png)\n\npowerplantmatching was initially developed by the\n[Renewable Energy Group](https://fias.uni-frankfurt.de/physics/schramm/complex-renewable-energy-networks/)\nat [FIAS](https://fias.uni-frankfurt.de/) to build power plant data\ninputs to [PyPSA](http://www.pypsa.org/)-based models for carrying\nout simulations for the [CoNDyNet project](http://condynet.de/),\nfinanced by the\n[German Federal Ministry for Education and Research 
(BMBF)](https://www.bmbf.de/en/)\nas part of the\n[Stromnetze Research Initiative](http://forschung-stromnetze.info/projekte/grundlagen-und-konzepte-fuer-effiziente-dezentrale-stromnetze/).\n\n\n### What it can do\n\n- clean and standardize power plant data sets\n- aggregate power plant units which belong to the same plant\n- compare and combine different data sets\n- create lookups and give statistical insight into power plant goodness\n- provide cleaned data from different sources\n- choose between gross/net capacity\n- provide an already merged data set of six different data-sources\n- scale the power plant capacities in order to match country specific statistics about total power plant capacities\n- visualize the data\n- export your powerplant data to a [PyPSA](https://github.com/PyPSA/PyPSA) or [TIMES](https://iea-etsap.org/index.php/etsap-tools/model-generators/times) model \n\n\n## Installation\n\n Using pip\n\n```bash\npip install powerplantmatching\n```\n\nor using conda \n\n```bash \nconda install -c conda-forge powerplantmatching\n```\n\n\n## Get the Data\n\nIn order to directly load the already built data into a pandas dataframe just call \n```python\nimport powerplantmatching as pm\npm.powerplants(from_url=True)\n```\n\nwhich will parse and store the [current dataset of powerplants of this repository](https://raw.githubusercontent.com/FRESNA/powerplantmatching/master/matched_data_red.csv). Setting `from_url=False` (default) will load all the necessary data files and combine them. Note that this might take some minutes. 
\n\n\nThe resulting dataset compared with the capacity statistics provided by the [ENTSOE SO&AF](https://data.open-power-system-data.org/national_generation_capacity/2019-02-22):\n\n![Capacity statistics comparison](https://raw.githubusercontent.com/FRESNA/powerplantmatching/v0.4.1/matching_analysis/factor_plot_Matched%20Data.png)\n\n\n\nThe dataset combines the data of all the data sources listed in\n[Data-Sources](#Data-Sources) and provides the following information:\n\n- **Power plant name** - claim of each database\n- **Fueltype** - {Bioenergy, Geothermal, Hard Coal, Hydro, Lignite, Nuclear, Natural Gas, Oil, Solar, Wind, Other}\n- **Technology**\t\t- {CCGT, OCGT, Steam Turbine, Combustion Engine, Run-Of-River, Pumped Storage, Reservoir}\n- **Set**\t\t\t- {Power Plant (PP), Combined Heat and Power (CHP), Storages (Stores)}\n- **Capacity**\t\t\t- \\[MW\\]\n- **Duration** \t- Maximum state of charge capacity in terms of hours at full output capacity \n- **Dam Information** - Dam volume [Mm^3] and Dam Height [m]\n- **Geo-position**\t\t- Latitude, Longitude\n- **Country** - EU-27 + CH + NO (+ UK) minus Cyprus and Malta\n- **YearCommissioned**\t\t- Commissioning year of the powerplant\n- **RetroFit** - Year of last retrofit \n- **projectID**\t\t\t- Immutable identifier of the power plant\n\n\n\n### Where is the data stored?\n\nAll data files of the package will be stored in the folder given by `pm.core.package_config['data_dir']`\n\n\n\n## Make your own configuration\n\n\nYou have the option to easily manipulate the resulting data by modifying the global configuration. 
Just save the [config.yaml file](https://github.com/FRESNA/powerplantmatching/blob/v0.4.1/powerplantmatching/package_data/config.yaml) as **~/.powerplantmatching_config.yaml** manually or, for Linux users, with \n\n```bash\nwget -O ~/.powerplantmatching_config.yaml https://raw.githubusercontent.com/FRESNA/powerplantmatching/v0.4.1/powerplantmatching/package_data/config.yaml\n```\n\nand change the **.powerplantmatching_config.yaml** file according to your wishes. Thereby you can\n\n\n\n\n\n- determine the global set of **countries** and **fueltypes**\n\n- determine which data sources to combine and which data sources should completely be contained in the final dataset\n\n- individually filter data sources via [pandas.DataFrame.query](http://pandas.pydata.org/pandas-docs/stable/indexing.html#the-query-method) statements set as an argument of data source name. See the default [config.yaml file](https://github.com/FRESNA/powerplantmatching/blob/v0.4.1/powerplantmatching/package_data/config.yaml) as an example\n\n\nOptionally you can:\n\n\n- add your ENTSOE security token to the **.powerplantmatching_config.yaml** file to enable updating the ENTSOE data yourself. The token can be obtained by following section 2 of the [RESTful API documentation](https://transparency.entsoe.eu/content/static_content/Static%20content/web%20api/Guide.html#_authentication_and_authorisation) of the ENTSO-E Transparency platform.\n\n- add your Google API key to the config.yaml file to enable geoparsing. The key can be obtained by following the [instructions](https://developers.google.com/maps/documentation/geocoding/get-api-key). 
\n\n\n\n\n\n## Data-Sources:\n\n- OPSD - [Open Power System Data](http://data.open-power-system-data.org/) publish their [data](http://data.open-power-system-data.org/conventional_power_plants/) under a free license\n- GEO - [Global Energy Observatory](http://globalenergyobservatory.org/), the data is not directly available on the website, but can be obtained from an [sqlite scraper](https://morph.io/coroa/global_energy_observatory_power_plants)\n- GPD - [Global Power Plant Database](http://datasets.wri.org/dataset/globalpowerplantdatabase) provide their data under a free license\n- CARMA - [Carbon Monitoring for Action](http://carma.org/plant)\n- ENTSOe - [European Network of Transmission System Operators for Electricity](http://entsoe.eu/), annually provides statistics about aggregated power plant capacities. Their data can be used as a validation reference. We further use their [annual energy generation report from 2010](https://www.entsoe.eu/db-query/miscellaneous/net-generating-capacity) as an input for the hydro power plant classification. 
The [power plant dataset](https://transparency.entsoe.eu/generation/r2/installedCapacityPerProductionUnit/show) on the ENTSO-E transparency website is downloaded using the [ENTSO-E Transparency API](https://transparency.entsoe.eu/content/static_content/Static%20content/web%20api/Guide.html).\n- JRC - [Joint Research Centre Hydro-power plants database](https://github.com/energy-modelling-toolkit/hydro-power-database)\n- IRENA - [International Renewable Energy Agency](http://resourceirena.irena.org/gateway/dashboard/) openly available statistics on power plant capacities.\n- BNETZA - [Bundesnetzagentur](https://www.bundesnetzagentur.de/EN/Areas/Energy/Companies/SecurityOfSupply/GeneratingCapacity/PowerPlantList/PubliPowerPlantList_node.html) openly available data source for Germany's power plants\n\n\nThe merged dataset is available in two versions: The bigger dataset, obtained by \n\n```python\npm.powerplants(reduced=False)\n```\n\nlinks the entries of the matched power plants and lists all the related\nproperties given by the different data-sources. The smaller, reduced dataset, given by\n```python\npm.powerplants()\n```\nclaims only the value of the most reliable data source being matched in the individual power plant data entry.\nThe considered reliability scores are:\n\n\n| Dataset | Reliability score |\n| :--------------- | :--------------- |\n| JRC | 6 |\n| ESE | 6 |\n| UBA | 5 |\n| OPSD | 5 |\n| OPSD_EU | 5 |\n| OPSD_DE | 5 |\n| WEPP | 4 |\n| ENTSOE | 4 |\n| IWPDCY | 3 |\n| GPD | 3 |\n| GEO | 3 |\n| BNETZA | 3 |\n| CARMA | 1 |\n\n\n\n## Getting Started\n\nA small presentation of the tool is given in the [jupyter notebook](https://github.com/FRESNA/powerplantmatching/blob/master/Example%20of%20Use.ipynb) \n\n\n\n## How it works\n\nWhereas single databases such as the CARMA, GEO or the OPSD database provide non-standardized and incomplete information, the datasets can complement each other and improve their reliability. 
\nIn a first step, powerplantmatching converts all powerplant datasets into a standardized format with a defined set of columns and values. The second part consists of aggregating power plant blocks together into units. Since some of the datasources provide their powerplant records on unit level, without detailed information about lower-level blocks, comparing with other sources is only possible on unit level. In the third and name-giving step the tool combines (or matches) different, standardized and aggregated input sources, keeping only power plant units which appear in more than one source. The matched data afterwards is complemented by data entries of reliable sources which have not matched. \n\nThe aggregation and matching process heavily relies on\n[DUKE](https://github.com/larsga/Duke), a Java application specialized\nfor deduplicating and linking data. It provides many built-in\ncomparators such as numerical, string or geoposition comparators. The\nengine does a detailed comparison for each single argument (power\nplant name, fuel-type etc.) using adjusted comparators and weights.\nFrom the individual scores for each column it computes a compound\nscore for the likeliness that the two powerplant records refer to the\nsame powerplant. If the score exceeds a given threshold, the two\nrecords of the power plant are linked and merged into one data set.\n\nLet's make that a bit more concrete by giving a quick\nexample. 
Consider the following two data sets\n\n### Dataset 1:\n\n| | Name | Fueltype | Classification | Country | Capacity | lat | lon | File |\n|---:|:--------------------|:-----------|-----------------:|:---------------|-----------:|--------:|-----------:|-------:|\n| 0 | Aarberg | Hydro | nan | Switzerland | 14.609 | 47.0444 | 7.27578 | nan |\n| 1 | Abbey mills pumping | Oil | nan | United Kingdom | 6.4 | 51.687 | -0.0042057 | nan |\n| 2 | Abertay | Other | nan | United Kingdom | 8 | 57.1785 | -2.18679 | nan |\n| 3 | Aberthaw | Coal | nan | United Kingdom | 1552.5 | 51.3875 | -3.40675 | nan |\n| 4 | Ablass | Wind | nan | Germany | 18 | 51.2333 | 12.95 | nan |\n| 5 | Abono | Coal | nan | Spain | 921.7 | 43.5588 | -5.72287 | nan |\n\nand\n\n### Dataset 2:\n\n| | Name | Fueltype | Classification | Country | Capacity | lat | lon | File |\n|---:|:------------------|:------------|:-----------------|:---------------|-----------:|--------:|--------:|-------:|\n| 0 | Aarberg | Hydro | nan | Switzerland | 15.5 | 47.0378 | 7.272 | nan |\n| 1 | Aberthaw | Coal | Thermal | United Kingdom | 1500 | 51.3873 | -3.4049 | nan |\n| 2 | Abono | Coal | Thermal | Spain | 921.7 | 43.5528 | -5.7231 | nan |\n| 3 | Abwinden asten | Hydro | nan | Austria | 168 | 48.248 | 14.4305 | nan |\n| 4 | Aceca | Oil | CHP | Spain | 629 | 39.941 | -3.8569 | nan |\n| 5 | Aceca fenosa | Natural Gas | CCGT | Spain | 400 | 39.9427 | -3.8548 | nan |\n\nwhere Dataset 2 has the higher reliability score. Apparently entries 0, 3 and 5 of Dataset 1 relate to the same\npower plants as the entries 0,1 and 2 of Dataset 2. 
The toolset detects those similarities and combines them into the following set, but prioritising the values of Dataset 2:\n\n| | Name | Country | Fueltype | Classification | Capacity | lat | lon | File |\n|---:|:------------|:---------------|:-----------|:-----------------|-----------:|--------:|---------:|-------:|\n| 0 | Aarberg | Switzerland | Hydro | nan | 15.5 | 47.0378 | 7.272 | nan |\n| 1 | Aberthaw | United Kingdom | Coal | Thermal | 1500 | 51.3873 | -3.4049 | nan |\n| 2 | Abono | Spain | Coal | Thermal | 921.7 | 43.5528 | -5.7231 | nan |\n\n\n## Citing powerplantmatching\n\nIf you want to cite powerplantmatching, use the following paper\n\n\n- F. Gotzens, H. Heinrichs, J. H\u00f6rsch, and F. Hofmann, [Performing energy modelling exercises in a transparent way - The issue of data quality in power plant databases](https://www.sciencedirect.com/science/article/pii/S2211467X18301056?dgcid=author), Energy Strategy Reviews, vol. 23, pp. 1\u201312, Jan. 2019.\n\nwith bibtex\n\n\n```\n@article{gotzens_performing_2019,\n\ttitle = {Performing energy modelling exercises in a transparent way - {The} issue of data quality in power plant databases},\n\tvolume = {23},\n\tissn = {2211467X},\n\turl = {https://linkinghub.elsevier.com/retrieve/pii/S2211467X18301056},\n\tdoi = {10.1016/j.esr.2018.11.004},\n\tlanguage = {en},\n\turldate = {2018-12-03},\n\tjournal = {Energy Strategy Reviews},\n\tauthor = {Gotzens, Fabian and Heinrichs, Heidi and H\u00f6rsch, Jonas and Hofmann, Fabian},\n\tmonth = jan,\n\tyear = {2019},\n\tpages = {1--12}\n}\n```\n\n\nand/or the current release stored on Zenodo with a release-specific DOI:\n\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.1889465.svg)](https://doi.org/10.5281/zenodo.1889465)\n\n\n\n## Acknowledgements\n\nThe development of powerplantmatching was helped considerably by\nin-depth discussions and exchanges of ideas and code with\n\n- Tom Brown from Karlsruhe Institute for Technology\n- Chris Davis from University of Groningen 
and\n- Johannes Friedrich, Roman Hennig and Colin McCormick of the World Resources Institute\n\n## Licence\n\nCopyright 2018-2020 Fabian Gotzens (FZ J\u00fclich), Jonas H\u00f6rsch (KIT), Fabian Hofmann (FIAS)\n\n\n\npowerplantmatching is released as free software under the\n[GPLv3](http://www.gnu.org/licenses/gpl-3.0.en.html), see\n[LICENSE](LICENSE) for further information.\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/FRESNA/powerplantmatching", "keywords": "", "license": "GPLv3", "maintainer": "", "maintainer_email": "", "name": "powerplantmatching", "package_url": "https://pypi.org/project/powerplantmatching/", "platform": "", "project_url": "https://pypi.org/project/powerplantmatching/", "project_urls": { "Homepage": "https://github.com/FRESNA/powerplantmatching" }, "release_url": "https://pypi.org/project/powerplantmatching/0.4.1/", "requires_dist": [ "numpy", "scipy", "pandas (>=0.23.0)", "networkx (>=1.10)", "pycountry", "xlrd", "seaborn", "pyyaml", "requests", "matplotlib", "geopy", "entsoe-py" ], "requires_python": "", "summary": "Toolset for generating and managing Power Plant Data", "version": "0.4.1" }, "last_serial": 5624548, "releases": { "0.4.0": [ { "comment_text": "", "digests": { "md5": "68c7adb3743cba11592e528cec3f4ee8", "sha256": "f841a47936b88c696e56774cd4d9689ac8bac12d192facd0e4010b30b13ee701" }, "downloads": -1, "filename": "powerplantmatching-0.4.0-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "68c7adb3743cba11592e528cec3f4ee8", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 704327, "upload_time": "2019-07-15T17:49:55", "url": "https://files.pythonhosted.org/packages/96/d4/1365f55786888935d76a2fba2cc02b568b74512f73469748d0759456294b/powerplantmatching-0.4.0-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": 
"f592dbe689ad689755a2d3e05053235a", "sha256": "f84b8b715e00a6bd74c340a04f71bbd00c6e959916e4b9d85e6450358cf2f84e" }, "downloads": -1, "filename": "powerplantmatching-0.4.0.tar.gz", "has_sig": false, "md5_digest": "f592dbe689ad689755a2d3e05053235a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 701143, "upload_time": "2019-07-15T17:49:58", "url": "https://files.pythonhosted.org/packages/09/5d/68a1a058554904d15f3caa695f686f01b988c33b230988dd359628087a9d/powerplantmatching-0.4.0.tar.gz" } ], "0.4.1": [ { "comment_text": "", "digests": { "md5": "f15a30b77721039ca4abffe8dd6a573f", "sha256": "b77a82e3532aa3c43da55e6096c296e9c3eb6c13b95673e2069cdd9a53e682ab" }, "downloads": -1, "filename": "powerplantmatching-0.4.1-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "f15a30b77721039ca4abffe8dd6a573f", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 711471, "upload_time": "2019-08-02T14:40:16", "url": "https://files.pythonhosted.org/packages/f8/d5/84ed42b40b1f118edfdaa474bb09c9682b1f95328b90b274a681dcfa7e83/powerplantmatching-0.4.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f05e80faafe488dbb7c6b07950afb8f5", "sha256": "f8adbd21322a9ae78a412c62bd31bc9d54e498389ac305a405bda2a15bf06e94" }, "downloads": -1, "filename": "powerplantmatching-0.4.1.tar.gz", "has_sig": false, "md5_digest": "f05e80faafe488dbb7c6b07950afb8f5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 708074, "upload_time": "2019-08-02T14:40:36", "url": "https://files.pythonhosted.org/packages/c4/75/3f797a33946f0f05469d9c981a95d3ac1fe7083c7b2625fd399141336464/powerplantmatching-0.4.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "f15a30b77721039ca4abffe8dd6a573f", "sha256": "b77a82e3532aa3c43da55e6096c296e9c3eb6c13b95673e2069cdd9a53e682ab" }, "downloads": -1, "filename": "powerplantmatching-0.4.1-py2.py3-none-any.whl", "has_sig": false, 
"md5_digest": "f15a30b77721039ca4abffe8dd6a573f", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 711471, "upload_time": "2019-08-02T14:40:16", "url": "https://files.pythonhosted.org/packages/f8/d5/84ed42b40b1f118edfdaa474bb09c9682b1f95328b90b274a681dcfa7e83/powerplantmatching-0.4.1-py2.py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f05e80faafe488dbb7c6b07950afb8f5", "sha256": "f8adbd21322a9ae78a412c62bd31bc9d54e498389ac305a405bda2a15bf06e94" }, "downloads": -1, "filename": "powerplantmatching-0.4.1.tar.gz", "has_sig": false, "md5_digest": "f05e80faafe488dbb7c6b07950afb8f5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 708074, "upload_time": "2019-08-02T14:40:36", "url": "https://files.pythonhosted.org/packages/c4/75/3f797a33946f0f05469d9c981a95d3ac1fe7083c7b2625fd399141336464/powerplantmatching-0.4.1.tar.gz" } ] }