{ "info": { "author": "Evan Sonderegger", "author_email": "evan@rpy.xyz", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Environment :: Console", "Intended Audience :: Developers", "License :: OSI Approved :: Apache Software License", "Operating System :: OS Independent", "Programming Language :: Python" ], "description": "This is a library for converting campaign finance filings stored in the .fec format into native python objects. It maps the comma/ASCII 28 delimited fields to canonical names based on the version the filing uses and then converts the values that are dates and numbers into the appropriate `int`, `float`, or `datetime` objects.\n\n\n\nThis library is in relatively early testing. I've used it on a couple of projects, but I wouldn't trust it to work on all filings. That said, if you do try using it, I'd love to hear about it!\n\n\n\n## Why?\n\nThe FEC makes a ton of data available via the \"export\" links on the main site and the [developer API](https://api.open.fec.gov/developers/). For cases where those data sources are sufficient, they are almost certainly the easiest/best way to go. A few cases where one might need to be digging into raw filings are:\n\n\n\n- Getting information from individual itemizations including addresses. (The FEC doesn't include street addresses in bulk downloads.)\n\n- Getting data as soon as it has been filed, instead of waiting for it to be coded. (The FEC generally codes all filings received by 7pm eastern by 7am the next day. 
However, that means that a filing received at 11:59pm on Monday wouldn't be available until 7am on Wednesday, for example.)\n\n- Getting more data than the rate limit on the developer API would allow.\n\n- Maintaining one's own database with all relevant campaign finance data, perhaps synced with another data source.\n\n\n\nRaw filings can be obtained either by downloading the [bulk data](https://www.fec.gov/data/advanced/?tab=bulk-data) zip files or via HTTP requests like [this](https://docquery.fec.gov/dcdev/posted/1229017.fec). This library includes helper methods for both.\n\n\n\n## Installation\n\nTo get started, install from [PyPI](https://pypi.org/project/fecfile/) by running the following command in your preferred terminal:\n\n\n\n```\n\npip install fecfile\n\n```\n\n\n\n## Usage\n\nFor the vast majority of filings, the easiest way to use this library will be to load filings all at once by using the `from_http(file_number)`, `from_file(file_path)`, or `loads(input)` methods.\n\n\n\nThese methods will return a Python dictionary, with keys for `header`, `filing`, `itemizations`, and `text`. 
The `itemizations` dictionary contains lists of itemizations grouped by type (`Schedule A`, `Schedule B`, etc.).\n\n\n\n### Examples:\n\n\n\n```\n\nimport fecfile\n\nimport requests\n\n\n\n# Load a filing from a local file and print its total receipts\n\nfiling1 = fecfile.from_file('1229017.fec')\n\nprint('${:,.2f}'.format(filing1['filing']['col_a_total_receipts']))\n\n\n\n# Load a filing over HTTP by its file number\n\nfiling2 = fecfile.from_http(1146148)\n\nprint(filing2['filing']['committee_name'])\n\n\n\n# Contributions are itemized on Schedule A\n\nfiling3 = fecfile.from_http(1146148)\n\nall_contributions = filing3['itemizations']['Schedule A']\n\nmid_size_contributions = [item for item in all_contributions if 500 <= item['contribution_amount'] < 1000]\n\nprint(len(mid_size_contributions))\n\n\n\n# Disbursements are itemized on Schedule B\n\nwith open('1229017.fec') as file:\n\n    parsed = fecfile.loads(file.read())\n\n    num_disbursements = len(parsed['itemizations']['Schedule B'])\n\n    print(num_disbursements)\n\n\n\n# Fetch the raw filing yourself and parse the response text\n\nurl = 'https://docquery.fec.gov/dcdev/posted/1229017.fec'\n\nr = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})\n\nparsed = fecfile.loads(r.text)\n\nfecfile.print_example(parsed)\n\n```\n\n\n\nNote: the docquery.fec.gov URLs cause problems with the `requests` library when a User-Agent header is not supplied. There may be a cleaner fix for that, though.\n\n\n\n## Advanced Usage\n\n\n\nFEC filings can be arbitrarily large. Loading enormous filings into memory all at once can cause problems (including running out of memory).\n\n\n\nThe `fecfile` library exposes the `iter_file` and `iter_http` methods to read large filings one line at a time. Both are generator functions that yield `FecItem` objects, which consist of `data` and `data_type` attributes. The `data_type` attribute can be one of \"header\", \"summary\", \"itemization\", \"text\", or \"F99_text\". 
The `data` attribute is a dictionary for every data type except \"F99_text\", for which it is a string.\n\n\n\n```\n\nimport fecfile\n\nimport imaginary_database\n\n\n\n# Sometimes we only care about summary data, but want to be able to handle all filings, without\n\n# knowing anything about them before we attempt to parse.\n\nno_itemizations = {'filter_itemizations': []}\n\nfor i in range(1300000, 1320000):\n\n    for item in fecfile.iter_http(i, options=no_itemizations):\n\n        if item.data_type == 'summary':\n\n            imaginary_database.add_to_db(item.data)\n\n\n\n# Sometimes we only care about one type of itemization, but from a very large filing.\n\n# In this example, we add up all the contributions from Delaware in ActBlue's 2018\n\n# post-general filing\n\nonly_contributions = {'filter_itemizations': ['SA']}\n\nde_total = 0\n\nfor item in fecfile.iter_http(1300352, options=only_contributions):\n\n    if item.data_type == 'itemization':\n\n        if item.data['contributor_state'] == 'DE':\n\n            de_total += item.data['contribution_amount']\n\nprint(de_total)\n\n\n\n# Sometimes we want to maintain a database where different types of itemizations live in their own\n\n# tables and have foreign key relationships to a summary record.\n\nfile_path = '/path/to/99840.fec'\n\nfiling = None\n\nfor item in fecfile.iter_file(file_path):\n\n    if item.data_type == 'summary':\n\n        filing = imaginary_database.add_filing(file_number=99840, **item.data)\n\n    if item.data_type == 'itemization':\n\n        if item.data['form_type'].startswith('SA'):\n\n            imaginary_database.add_contribution(filing=filing, **item.data)\n\n        if item.data['form_type'].startswith('SB'):\n\n            imaginary_database.add_disbursement(filing=filing, **item.data)\n\n        if item.data['form_type'].startswith('SC'):\n\n            imaginary_database.add_loan(filing=filing, **item.data)\n\n```\n\n\n\nYou can also choose to use the `parse_header` and `parse_line` methods if you are implementing a different way of iterating over a filing's content. 
Before version 0.6, the example below was the only way to use `fecfile` to parse filings without loading the entire filing into memory. This approach should no longer be necessary, but it is kept to show example usage of those methods.\n\n\n\n```\n\nimport fecfile\n\n\n\nversion = None\n\n\n\nwith open('1263179.fec') as file:\n\n    for line in file:\n\n        if version is None:\n\n            # The first line is the header, which identifies the filing's format version\n\n            header, version = fecfile.parse_header(line)\n\n        else:\n\n            parsed = fecfile.parse_line(line, version)\n\n            # save_to_db is a placeholder for your own storage function\n\n            save_to_db(parsed)\n\n```\n\n\n\n\n\n## API Reference\n\n\n\n