{ "info": { "author": "Dan Eads", "author_email": "24708079+daneads@users.noreply.github.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)", "Programming Language :: Python :: 3", "Topic :: Internet" ], "description": "# pypatent\n\npypatent is a tiny Python package to easily search for and scrape US Patent and Trademark Office Patent Data.\n\n[PyPI page](https://pypi.python.org/pypi/pypatent)\n\n#### New in version 1.2\n\nThis version implements Selenium support for scraping.\nPrevious versions were using the `requests` library for all requests, however this has had problems with the USPTO site lately.\nI notice some users have been able to use `requests` without issue, while others get 4xx errors.\n\nPyPatent Version 1.2 implements an optional new WebConnection object to give the user the option to use Selenium WebDrivers in place of the `requests` library.\nThis WebConnection object is optional.\nIf used, it should be passed as an argument when initializing `Search` or `Patent` objects.\n\nUse it in the following cases:\n* When you want to use Selenium instead of `requests`\n* When you want to use `requests` but with a custom user-agent or headers\n\nSee bottom of README for examples.\n\n## Requirements\n\nPython 3, BeautifulSoup, requests, pandas, re, selenium\n\n## Installation\n\n```\npip install pypatent\n```\n\nIf using Selenium for scraping (introduced in version 1.2), be sure to install a Selenium WebDriver.\nFor Chrome, use `chromedriver`. For Firefox, use `geckodriver`.\nSee [the Selenium download page](https://www.seleniumhq.org/download/) for more details and options.\n\n## Searching for patents\n\nThe Search object works similarly to the [Advanced Search at the USPTO](http://patft.uspto.gov/netahtml/PTO/search-adv.htm), with additional options.\n\n### Specifying patent criteria for your search\n\nThere are two methods to specify your search criteria, and you can use one or both.\n\n#### Search Method 1: Using a custom string\n\nYou may search for a certain string in all fields of the patent:\n```python\npypatent.Search('microsoft') # Will return results matching 'microsoft' in any field\n```\n\nYou may also specify complex search criteria as demonstrated on the [USPTO site](http://patft.uspto.gov/netahtml/PTO/help/helpadv.htm):\n```python\npypatent.Search('TTL/(tennis AND (racquet OR racket))')\n```\n\n#### Search Method 2: Specify USPTO search fields (see Field Codes below)\n\nAlternatively, you can specify one or more Field Code arguments to search within the specified fields. Multiple Field Code arguments will create a search with AND logic. OR logic can be used within a single argument. For more complex logic, use a custom string.\n```python\npypatent.Search(pn='adobe', ttl='software') # Equivalent to search('PN/adobe AND TTL/software')\npypatent.Search(pn=('adobe or macromedia'), ttl='software') # Equivalent to search('PN/(adobe or macromedia) AND TTL/software')\n```\n\n#### Combining search methods 1 and 2\n\nString criteria can be used in conjunction with Field Code arguments:\n```python\npypatent.Search('acrobat', pn='adobe', ttl='software') # Equivalent to search('acrobat AND PN/adobe AND TTL/software')\n```\n\nThe Field Code arguments have the same meaning as on the [USPTO site](http://patft.uspto.gov/netahtml/PTO/search-adv.htm).\n\n### Additional search options\n\n#### Limit the number of results\n\nThe `results_limit` argument lets you change how many patent results are retrieved. The default is 50, equivalent to one page of results.\n\n```python\npypatent.Search('microsoft', results_limit=10) # Fetch 10 results only\n```\n\n#### Specify whether to fetch details for each patent\n\nBy default, pypatent retrieves the details of every patent by visiting each patent's URL from the search results.\nThis can take a long time since each page has to be scraped.\nIf you just need the patent titles and URLs from the search results, set `get_patent_details` to `False`:\n\n```python\npypatent.Search('microsoft', get_patent_details=False) # Fetch patent numbers and titles only\n```\n\n### Formatting your search results\n\npypatent has convenience methods to format the Search object into either a Pandas DataFrame or list of dicts.\n\n#### Format as Pandas DataFrame:\n```python\npypatent.Search('microsoft').as_dataframe()\n```\n\n#### Format as list of dicts:\n```python\npypatent.Search('microsoft', get_patent_details=False).as_list()\n```\n\nSample result (without patent details):\n```\n[{\n 'title': 'Electronic device',\n 'url': 'http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.htm&r=1&p=1&f=G&l=50&d=PTXT&S1=microsoft&OS=microsoft&RS=microsoft'\n },\n\n {'title': 'Portable electric device', ... }\n```\n\n## The Patent class\nThe `Search` class uses the `Patent` class to retrieve and store patent details for a given patent URL.\nYou can use it directly if you already know the patent URL (e.g. you ran a Search with `get_patent_details=False`)\n\n```python\n# Create a Patent object\nthis_patent = pypatent.Patent(title='Base station device, first location management device, terminal device, communication control method, and communication system',\n url='http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.htm&r=4&p=1&f=G&l=50&d=PTXT&S1=aaa&OS=aaa&RS=aaa')\n\n# Fetch the details\nthis_patent.fetch_details()\n```\n\n### Patent Attributes Retrieved:\n\n*Note, not all fields from the patent page are scraped. I hope to add more, and pull requests are appreciated :)*\n\n* patent_num: Patent Number\n* patent_date: Issue Date\n* abstract: Abstract\n* inventors: List of Names of Inventors and Their Locations\n* applicant_name: Applicant Name\n* applicant_city: Applicant City\n* applicant_state: Applicant State\n* applicant_country: Applicant Country\n* assignee_name: Assignee Name\n* assignee_loc: Assignee Location\n* family_id: Family ID\n* applicant_num: Applicant Number\n* file_date: Filing date\n* claims: Claims Description (as a list)\n* description: Patent Description (as a list)\n\n### Field Code Arguments for Search Function\n* PN: Patent Number\n* ISD: Issue Date\n* TTL: Title\n* ABST: Abstract\n* ACLM: Claim(s)\n* SPEC: Description/Specification\n* CCL: Current US Classification\n* CPC: Current CPC Classification\n* CPCL: Current CPC Classification Class\n* ICL: International Classification\n* APN: Application Serial Number\n* APD: Application Date\n* APT: Application Type\n* GOVT: Government Interest\n* FMID: Patent Family ID\n* PARN: Parent Case Information\n* RLAP: Related US App. Data\n* RLFD: Related Application Filing Date\n* PRIR: Foreign Priority\n* PRAD: Priority Filing Date\n* PCT: PCT Information\n* PTAD: PCT Filing Date\n* PT3D: PCT 371 Date\n* PPPD: Prior Published Document Date\n* REIS: Reissue Data\n* RPAF Reissued Patent Application Filing Date\n* AFFF: 130(b) Affirmation Flag\n* AFFT: 130(b) Affirmation Statement\n* IN: Inventor Name\n* IC: Inventor City\n* IS: Inventor State\n* ICN: Inventor Country\n* AANM: Applicant Name\n* AACI: Applicant City\n* AAST: Applicant State\n* AACO: Applicant Country\n* AAAT: Applicant Type\n* LREP: Attorney or agent\n* AN: Assignee Name\n* AC: Assignee City\n* AS: Assignee State\n* ACN: Assignee Country\n* EXP: Primary Examiner\n* EXA: Assistant Examiner\n* REF: Referenced By\n* FREF: Foreign References\n* OREF: Other References\n* COFC: Certificate of Correction\n* REEX: Re-Examination Certificate\n* PTAB: PTAB Trial Certificate\n* SEC: Supplemental Exam Certificate\n* ILRN: International Registration Number\n* ILRD: International Registration Date\n* ILPD: International Registration Publication Date\n* ILFD: Hague International Filing Date\n\n### Changelog\n#### New in version 1.2\n\nThis version implements Selenium support for scraping.\nPrevious versions were using the `requests` library for all requests, however the USPTO site has been causing problems for it.\nI notice some users have been able to use `requests` without issue, while others get 4xx errors.\n\nPyPatent Version 1.2 implements a new WebConnection object to give the user the option to use Selenium WebDrivers in place of the `requests` library.\nThis WebConnection object is optional.\nIf used, it should be passed as an argument when initializing `Search` or `Patent` objects.\nUse it in the following cases:\n* When you want to use Selenium instead of `requests`\n* When you want to use `requests` but with a custom user-agent or headers\n\nAn example using the Firefox WebDriver:\n\n```python\nimport pypatent\nfrom selenium import webdriver\n\ndriver = webdriver.Firefox() # Requires geckodriver in your PATH\n\nconn = pypatent.WebConnection(use_selenium=True, selenium_driver=driver)\n\nres = pypatent.Search('microsoft', get_patent_details=True, web_connection=conn)\n\nprint(res)\n```\n\nAn example using the `requests` library with a custom user agent:\n```python\nimport pypatent\n\nconn = pypatent.WebConnection(use_selenium=False, user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36')\n\nres = pypatent.Search('microsoft', get_patent_details=True, web_connection=conn)\n\nprint(res)\n```\n\nAn example using the `requests` library with default user agent (WebConnection is not necessary here as we are using the defaults)\n```python\nimport pypatent\n\nres = pypatent.Search('microsoft', get_patent_details=True)\n\nprint(res)\n```\n\n#### New in version 1.1:\n\nThis version makes searching and storing patent data easier:\n* Simplified to 2 objects: `Search` and `Patent`\n* A `Search` object searches the USPTO site and can output the results as a DataFrame or list. It can scrape the details of each patent, or just get the patent title and URL. Most users will only need to use this object.\n* A `Patent` object fetches and holds a single patent's info. Fetching the patent's details is now optional. This object should only be used when you already have the patent URL and aren't conducting a search.\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/daneads/pypatent", "keywords": "patent,uspto,scrape,scraping", "license": "GNU GPLv3", "maintainer": "", "maintainer_email": "", "name": "pypatent", "package_url": "https://pypi.org/project/pypatent/", "platform": "", "project_url": "https://pypi.org/project/pypatent/", "project_urls": { "Homepage": "http://github.com/daneads/pypatent" }, "release_url": "https://pypi.org/project/pypatent/1.2.0/", "requires_dist": [ "bs4", "requests", "pandas", "selenium" ], "requires_python": ">=3", "summary": "Search and retrieve USPTO patent data", "version": "1.2.0" }, "last_serial": 5537807, "releases": { "1.0.0": [ { "comment_text": "", "digests": { "md5": "90185e7ec7d25d51a329f1355779d7ab", "sha256": "1fbcc68f03870c6185a33add54c60bd5e6106953b6719a91abf55a886bf40e00" }, "downloads": -1, "filename": "pypatent-1.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "90185e7ec7d25d51a329f1355779d7ab", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3", "size": 4108, "upload_time": "2017-10-26T05:34:40", "url": "https://files.pythonhosted.org/packages/5f/8e/0b031a0a3d733518e8bb80f6097e6da56bd7f6b5e394fde39626f8831da8/pypatent-1.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "6deffaa7961026eac198fdf33c192562", "sha256": "99dcf9137708c6a92d50c167d92973327f49500eed4a8ec2afc656b726f4e3b6" }, "downloads": -1, "filename": "pypatent-1.0.0.tar.gz", "has_sig": false, "md5_digest": "6deffaa7961026eac198fdf33c192562", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 4469, "upload_time": "2017-10-26T05:34:42", "url": "https://files.pythonhosted.org/packages/33/8f/ecfd370e76de710e589836873d3c90e5dfc882dd56ad572716151705850e/pypatent-1.0.0.tar.gz" } ], "1.0.2": [ { "comment_text": "", "digests": { "md5": "9e53d9b70a48755573866fc9a817dae0", "sha256": "fd7406bbbeac264b5b1e0000613839be04cc8e89e5c88170132ddf136080a62f" }, "downloads": -1, "filename": "pypatent-1.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "9e53d9b70a48755573866fc9a817dae0", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3", "size": 5074, "upload_time": "2018-04-16T23:03:43", "url": "https://files.pythonhosted.org/packages/3f/43/56b35d69074de4ee298f565f6be6e1cc26c3070e021810cd90ab1af6acab/pypatent-1.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "448f2e2583e015f311bd48369bdce422", "sha256": "6dd5747a590af81badb115adbdf73413c437b895a1191573949c64752e1f8ec8" }, "downloads": -1, "filename": "pypatent-1.0.2.tar.gz", "has_sig": false, "md5_digest": "448f2e2583e015f311bd48369bdce422", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 5446, "upload_time": "2018-04-16T23:03:45", "url": "https://files.pythonhosted.org/packages/a1/cb/eefbbee73845fae94231235771abbfadbd075f4c51691c0ec7992d1006a2/pypatent-1.0.2.tar.gz" } ], "1.1.0": [ { "comment_text": "", "digests": { "md5": "a934b9acc9848cd9e65ab5c28058e00a", "sha256": "5acd61776c0687b051aae5691c7ed93045a23770f95c8b87d5a863e9d97a7c53" }, "downloads": -1, "filename": "pypatent-1.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "a934b9acc9848cd9e65ab5c28058e00a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3", "size": 18895, "upload_time": "2018-11-26T02:27:03", "url": "https://files.pythonhosted.org/packages/86/9f/5ec85a2c47396b0b2113dc9729f04a11cad09a327b1ccc45d15992566ffa/pypatent-1.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "55a1def9f8e1d13054e8e7fdb7c3c012", "sha256": "7cddbd944d9ed437e645a61f6b9a9ce25f04ca7f73f57bafcacf4fc667846df1" }, "downloads": -1, "filename": "pypatent-1.1.0.tar.gz", "has_sig": false, "md5_digest": "55a1def9f8e1d13054e8e7fdb7c3c012", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 6712, "upload_time": "2018-11-26T02:27:05", "url": "https://files.pythonhosted.org/packages/c7/8c/63fe37c7ddaafcdf7c7642121bb37517dcaed51eb2c94c82e81c28630577/pypatent-1.1.0.tar.gz" } ], "1.2.0": [ { "comment_text": "", "digests": { "md5": "8a344776850fffbee3e4776f3ab87776", "sha256": "f9712ab3e96d2958b1118fa868be7e9bea270fd04f7544c7003370b4956a2cac" }, "downloads": -1, "filename": "pypatent-1.2.0-py3-none-any.whl", "has_sig": false, "md5_digest": "8a344776850fffbee3e4776f3ab87776", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3", "size": 19726, "upload_time": "2019-07-16T01:42:30", "url": "https://files.pythonhosted.org/packages/7b/4c/0071e2f448cf9125bbf7f987003b7cbae319f03fe5dd11dec29ae1bedb1c/pypatent-1.2.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "c6c828f6f276ed316beedb2c24499161", "sha256": "6ebb2e4e1d8a801a78560607577487c6be97884a1ac1f22cc09350484867fe8d" }, "downloads": -1, "filename": "pypatent-1.2.0.tar.gz", "has_sig": false, "md5_digest": "c6c828f6f276ed316beedb2c24499161", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 7756, "upload_time": "2019-07-16T01:42:31", "url": "https://files.pythonhosted.org/packages/ba/37/deac6e24fcdf9a10d8963bc973f0707f36c0038de7a2cc558e0898324e57/pypatent-1.2.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "8a344776850fffbee3e4776f3ab87776", "sha256": "f9712ab3e96d2958b1118fa868be7e9bea270fd04f7544c7003370b4956a2cac" }, "downloads": -1, "filename": "pypatent-1.2.0-py3-none-any.whl", "has_sig": false, "md5_digest": "8a344776850fffbee3e4776f3ab87776", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3", "size": 19726, "upload_time": "2019-07-16T01:42:30", "url": "https://files.pythonhosted.org/packages/7b/4c/0071e2f448cf9125bbf7f987003b7cbae319f03fe5dd11dec29ae1bedb1c/pypatent-1.2.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "c6c828f6f276ed316beedb2c24499161", "sha256": "6ebb2e4e1d8a801a78560607577487c6be97884a1ac1f22cc09350484867fe8d" }, "downloads": -1, "filename": "pypatent-1.2.0.tar.gz", "has_sig": false, "md5_digest": "c6c828f6f276ed316beedb2c24499161", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3", "size": 7756, "upload_time": "2019-07-16T01:42:31", "url": "https://files.pythonhosted.org/packages/ba/37/deac6e24fcdf9a10d8963bc973f0707f36c0038de7a2cc558e0898324e57/pypatent-1.2.0.tar.gz" } ] }