{ "info": { "author": "Akshowhini", "author_email": "brain@extracttable.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7" ], "description": "[![image](https://i.imgur.com/YIHmXue.png?1)](https://extracttable.com?ref=github-CP)\n\n# CamelotPro: Pro-version of Camelot \n[![image](https://img.shields.io/pypi/v/camelotpro.svg?maxAge=3600)](https://pypi.org/project/camelotpro/) [![image](https://img.shields.io/github/license/extracttable/camelotpro)]() [![image](https://img.shields.io/badge/python-3.5%20%7C%203.6%20%7C%203.7-blue)]() \n\n**CamelotPro** is a layer on top of the camelot-py library that extracts tables from **scanned PDFs and images**. \n\n\n## CamelotPro vs Camelot\n\nIn code, **CamelotPro** works just like the original Camelot. It adds an extra **`flavor=\"CamelotPro\"`** option to `read_pdf()`, alongside the regular \"*lattice*\" and \"*stream*\".\n\n\n## Installation \n> \ud83d\udca1 ***ProTip**: [ExtractTable-py](https://github.com/ExtractTable/ExtractTable-py) is the official library, which is FASTER than the Camelot wrapper and has NO software dependencies.* \n\n\nBecause the library depends on Camelot, which has its own software dependencies, the developer is expected to install them *(listed below)* in order to use the regular Camelot flavors *(\"stream\", \"lattice\")* along with \"CamelotPro\". 
\n\nPlease follow the **OS-specific instructions** \n\n- [Tkinter](https://camelot-py.readthedocs.io/en/master/user/install-deps.html#os-specific-instructions)\n- [GhostScript](https://camelot-py.readthedocs.io/en/master/user/install-deps.html#for-ghostscript) \n\n\n\n### Using pip \nAfter installing the dependencies for Camelot, you can simply use pip to install CamelotPro: \n\n $ pip install -U CamelotPro \n\n\n## Prerequisites\n\nThe developer needs an **api_key** ([free credits here](https://extracttable.com/camelotpro.html)) to use CamelotPro. Each image file or PDF page consumes one credit.\n\n**api_key** should be passed through `pro_kwargs`, a `dict` type argument that accepts *api_key*, *job_id*, *dup_check*, and *max_wait_time* as keys, as described below\n\n {\n \"api_key\": str,\n Mandatory, to trigger \"CamelotPro\" flavor, to process scanned PDFs and images, as well as text PDF files\n\n \"job_id\": str,\n optional, if processing a new file\n Mandatory, to retrieve the result of the already submitted file\n\n \"dup_check\": bool, default: False - to bypass the duplicate check\n Useful to handle duplicate requests, check based on the FileName\n\n \"max_wait_time\": int, default: 300\n Checks for the output every 15 seconds until successfully processed or for a maximum of 300 seconds.\n }\n\n\n\n## Let's code\n\n**Quickly validate the API key and see the number of credits attached to it**\n```python\napi_key = \"YOUR_API_KEY_HERE\"\n\nfrom camelot_pro import check_usage\nprint(check_usage(api_key))\n```\n*If the snippet above runs without an error, the API key is valid.*\n\n\n**Here's how you can extract tables from image files.** \n\n\nThe example image (*foo-image.**jpg***) used in the code below can be found [here](https://github.com/extracttable/camelotpro/blob/master/samples/foo-image.jpg). 
Notice that *foo-image.jpg* is the image version of Camelot's example, [foo.pdf](https://github.com/camelot-dev/camelot/blob/master/docs/_static/pdf/foo.pdf).\n\n```python\nfrom camelot_pro import read_pdf\npro_tables = read_pdf('foo-image.jpg', flavor=\"CamelotPro\", pro_kwargs={'api_key': api_key}) \n``` \n\n\nNow that you have triggered the process to find tables in the image, you can check its status through the `JobStatus` attribute, which returns one of *Success, Failed, Processing, Incomplete*.\n\n pro_tables.JobStatus\n [Out]: \"Success\"\n\n pro_tables[0].df # get a pandas DataFrame! \n\n\n| Col_1 | Col_2 | Col_3 | Col_4 | Col_5 | Col_6 | Col_7 |\n|------------|-----------|---------------|----------------------|-----------------|-----------------|----------------|\n| Cycle Name | KI (1/km) | Distance (mi) | Percent Fuel Savings | | | |\n| | | | Improved Speed | Decreased Accel | Eliminate Stops | Decreased Idle |\n| 2012_2 | 3.30 | 1.3 | 5.9% | 9.5% | 29.2% | 17.4% |\n| 2145_1 | 0.68 | 11.2 | 2.4% | 0.1% | 9.5% | 2.7% |\n| 4234_1 | 0.59 | 58.7 | 8.5% | 1.3% | 8.5% | 3.3% |\n| 2032_2 | 0.17 | 57.8 | 21.7% | 0.3% | 2.7% | 1.2% |\n| 4171_1 | 0.07 | 173.9 | 58.1% | 1.6% | 2.1% | 0.5% |\n\n\nWhen the `JobStatus` is \"Success\", the output gives the gist of the process, just like Camelot.\n\n pro_tables\n [Out]: # Will be for any other JobStatus\n pro_tables[0].df # get a pandas DataFrame!\n\n... and then there are the regular Camelot functions and attributes\n\n pro_tables.export('foo.csv', f='csv', ) # json, excel, html, sqlite \n\n pro_tables[0]\n [Out]: \n\n pro_tables[0].parsing_report \n [Out]: { \n 'accuracy': 75.12, \n 'whitespace': 0.86, \n 'order': 1, \n 'page': 1 \n }\n\n pro_tables[0].to_csv('foo.csv') # to_json, to_excel, to_html, to_sqlite \n\n\n> ***ProTip**: It is very useful to inspect all attributes of the output when the `JobStatus` is **not \"Success\"**.*\n\n pro_tables.__dict__\n\n [Out]: \n {\n '_tables': [
], # List of tables found with their shapes\n 'Pages': 1, # Number of Input pages, equivalent to credits used\n 'JobStatus': 'Success' # Success | Failed | Processing | Incomplete\n }\n\n\nMost image files return an instant 'Success' job status. At times, a blurry, large, or low-quality file may take ~15 seconds, and the processing time for a PDF file depends on its page count. In these cases, the `JobStatus` is **\"Processing\"** and the `JobId` attribute of the output is used to retrieve tables as shown below.\n\n\n pro_tables.JobStatus\n [Out]: \"Processing\"\n\n job_id = pro_tables.JobId\n print(job_id)\n [Out]: \"d93e9af0f632084394099dabeb150ead7ee2ed5250377cb4772a358abcc21cf2\"\n\n retrieve_output = read_pdf('', flavor=\"CamelotPro\", pro_kwargs={'api_key': api_key, 'job_id': job_id})\n print(retrieve_output.JobStatus)\n [Out]: \"Success\"\n\n\n\n> ***ProTip**: To receive **immediate Success on image files**, use `'dup_check': False` in `pro_kwargs`*\n\n instant_pro_tables = read_pdf('foo-image.jpg', flavor=\"CamelotPro\", pro_kwargs={'api_key': api_key, 'dup_check': False})\n\n\n## New and Re-defined Attributes of CamelotPro\n\n\n|Attribute|Explanation|\n|----|----|\n|`pro_tables.Pages` |Total number of input pages processed. Equivalent to credits used|\n|`pro_tables.JobStatus` | \"**Success**\" - Check output for tables or use \"JobId\" to retrieve tables
\"**Failed**\" - Process Failed, No Credits used
\"**Processing**\" - Still in process, use \"JobId\" to retrieve the output later
\"**Incomplete**\" - Process finished, but not all pages were processed. Partial output|\n|`pro_tables.Message`|Gives the reason for the failure or issue|\n|`pro_tables.ProTip`|Hints on how to avoid the errors, if it can be rectified with developer input|\n|`pro_tables[0].accuracy`|Accuracy of text assignment to the cell|\n|`pro_tables[0].accuracy_character`|Accuracy of Characters recognized from the image|\n|`pro_tables[0].accuracy_layout` |Accuracy of table layout's design decision|\n|`pro_tables[0].whitespace`|Percentage of Error in Character recognition|\n\n\n\n## Pull Requests & Rewards\n\nPull requests are most welcome and greatly appreciated. \n\n\n## License \n\nThis project is licensed under the GPL-3.0 License; see the [LICENSE](https://github.com/extracttable/camelotpro/blob/master/LICENSE) file for details.\n\n\n## Credits\n\nLast but not least, we are thankful to the contributors of [camelot-py](https://github.com/atlanhq/camelot/)\n\n# Social Media\nFollow us on social media for library updates and free credits.\n\n[![Image](https://cdn3.iconfinder.com/data/icons/socialnetworking/32/linkedin.png)](https://www.linkedin.com/company/extracttable)\n    \n[![Image](https://abs.twimg.com/favicons/twitter.ico)](https://twitter.com/extracttable)\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/ExtractTable/camelotpro", "keywords": "", "license": "GPL-3.0", "maintainer": "", "maintainer_email": "", "name": "CamelotPro", "package_url": "https://pypi.org/project/CamelotPro/", "platform": "", "project_url": "https://pypi.org/project/CamelotPro/", "project_urls": { "Homepage": "https://github.com/ExtractTable/camelotpro" }, "release_url": "https://pypi.org/project/CamelotPro/1.2.0/", "requires_dist": [ "ExtractTable (>=1.1.0)", "camelot-py (>=0.7.3)" ], "requires_python": "", "summary": "CamelotPro is a layer on camelot-py 
library to extract tables from Scan PDFs and Images.", "version": "1.2.0" }, "last_serial": 6004902, "releases": { "0.7.3": [ { "comment_text": "", "digests": { "md5": "99da03e0798538f4adc26d40436e1a89", "sha256": "798e6c944ea98b94b4afce940024abe2dd054582ef4cb89cf1be15245df39f21" }, "downloads": -1, "filename": "CamelotPro-0.7.3-py3-none-any.whl", "has_sig": false, "md5_digest": "99da03e0798538f4adc26d40436e1a89", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 23714, "upload_time": "2019-08-26T00:13:32", "url": "https://files.pythonhosted.org/packages/9e/d5/e64298b00ec3d77c350ea9bc9303044c837d686f736112ac5bea2622b459/CamelotPro-0.7.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3429926c9a7f97c101c439ca65b58318", "sha256": "137c2e0b759678e5e4ef87c35d1be41d0ee8edfa8faa2c1c87d6165df6310eff" }, "downloads": -1, "filename": "CamelotPro-0.7.3.tar.gz", "has_sig": false, "md5_digest": "3429926c9a7f97c101c439ca65b58318", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9625, "upload_time": "2019-08-26T00:13:34", "url": "https://files.pythonhosted.org/packages/73/2e/443372efe50ef06d6f342eb4957e9bf28d9ed81bc6502fbe44e80e376118/CamelotPro-0.7.3.tar.gz" } ], "0.7.3.1": [ { "comment_text": "", "digests": { "md5": "57809dc14818da845c01cf2b1f199df0", "sha256": "b325046be67196ef1942464579438521601f4d95d8306feeaded3307c10df60c" }, "downloads": -1, "filename": "CamelotPro-0.7.3.1-py3-none-any.whl", "has_sig": false, "md5_digest": "57809dc14818da845c01cf2b1f199df0", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 23780, "upload_time": "2019-08-29T01:28:18", "url": "https://files.pythonhosted.org/packages/18/a3/0d31b0510143b0602fa31bba9485490bc9ee3f1bccd60461ae716594e38a/CamelotPro-0.7.3.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "0af2265fda06870a29dfb9952d4ab094", "sha256": 
"d3f1d2127f0c9a3f35ccdd7127900a7fd00f9f231f5812f493b7a1c28551352b" }, "downloads": -1, "filename": "CamelotPro-0.7.3.1.tar.gz", "has_sig": false, "md5_digest": "0af2265fda06870a29dfb9952d4ab094", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9753, "upload_time": "2019-08-29T01:27:57", "url": "https://files.pythonhosted.org/packages/02/50/3f58a0a1171dfe37db6d565801ad6aab4144040b8bf8ade7a44b1014066d/CamelotPro-0.7.3.1.tar.gz" } ], "0.7.3.2": [ { "comment_text": "", "digests": { "md5": "d19388076db2b0a9ed76f5c1f0f4082f", "sha256": "607d00eb34bdd9a03d98b45d4e91ad3c39385c31b1c6a67d93821458f12498d0" }, "downloads": -1, "filename": "CamelotPro-0.7.3.2-py3-none-any.whl", "has_sig": false, "md5_digest": "d19388076db2b0a9ed76f5c1f0f4082f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 24659, "upload_time": "2019-09-26T03:43:09", "url": "https://files.pythonhosted.org/packages/bb/97/bf0c19581ee760d9b89496b156b52a0b81ba27f95d54bf126878acf59b1e/CamelotPro-0.7.3.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "503a09694a75d08cd868b95379edd068", "sha256": "ed4e9267a4ffce15470fcfe03e5ee93d910ff23ba47886dc32c620aefaeb2b41" }, "downloads": -1, "filename": "CamelotPro-0.7.3.2.tar.gz", "has_sig": false, "md5_digest": "503a09694a75d08cd868b95379edd068", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10285, "upload_time": "2019-09-26T03:43:11", "url": "https://files.pythonhosted.org/packages/1f/7a/e837b4ab7e8de96e36d78019165998afd2498d5203710fbf4bd48d25d762/CamelotPro-0.7.3.2.tar.gz" } ], "0.7.3a2": [ { "comment_text": "", "digests": { "md5": "7fa1adc30633b029f1274c549a8c44dd", "sha256": "90815fbc8356a679d46b141686126deaae26d7dc8115f1aa88ceccc814aed206" }, "downloads": -1, "filename": "CamelotPro-0.7.3a2-py3-none-any.whl", "has_sig": false, "md5_digest": "7fa1adc30633b029f1274c549a8c44dd", "packagetype": "bdist_wheel", "python_version": "py3", 
"requires_python": null, "size": 23735, "upload_time": "2019-08-25T19:29:46", "url": "https://files.pythonhosted.org/packages/b2/d9/89d0935168f0cf0e78fdde35aa37da052d85025a575d5981a4bfa3bd4e4b/CamelotPro-0.7.3a2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "e8d9eec38b9f85445554aa75bf524318", "sha256": "7330ce932fa8f66f508ee42df02f3a3413f586d333fb39ad0d68875ea5befdaa" }, "downloads": -1, "filename": "CamelotPro-0.7.3a2.tar.gz", "has_sig": false, "md5_digest": "e8d9eec38b9f85445554aa75bf524318", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9625, "upload_time": "2019-08-25T19:29:48", "url": "https://files.pythonhosted.org/packages/74/82/aa7f7b8f3ab1ac77b05d1ede419a29b49e464c200397a33f4faec0f69bb7/CamelotPro-0.7.3a2.tar.gz" } ], "0.7.4": [ { "comment_text": "", "digests": { "md5": "4a87d2ed8600a35bc7dea1a615506d37", "sha256": "ebde001aeed0d5952fc904faf64062413fc172ce0ffac4bc9d54157f2d1e473f" }, "downloads": -1, "filename": "CamelotPro-0.7.4-py3-none-any.whl", "has_sig": false, "md5_digest": "4a87d2ed8600a35bc7dea1a615506d37", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 24692, "upload_time": "2019-10-04T03:11:19", "url": "https://files.pythonhosted.org/packages/fd/c5/3d436810eb41b9e9c94d10bf74835d970acb2cab87ce145f7c5e44ff0f84/CamelotPro-0.7.4-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "a0bfa69e857c94482508af9aeed2634a", "sha256": "caf2d4f9feda9dbd4fcedf1aba69c5eb53f271e6652daac215c0c7e38a66eb43" }, "downloads": -1, "filename": "CamelotPro-0.7.4.tar.gz", "has_sig": false, "md5_digest": "a0bfa69e857c94482508af9aeed2634a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 10360, "upload_time": "2019-10-04T03:11:21", "url": "https://files.pythonhosted.org/packages/43/b4/eff8d46af0e2f3ac5c97133a691d065b0063edceda4a388c94fa13a26dc8/CamelotPro-0.7.4.tar.gz" } ], "1.0.0": [ { "comment_text": "", "digests": { "md5": 
"252c5b7049e72f5105ce4a1151b97880", "sha256": "ca3f0cc0a0f1294c74f69e1ab2c1ed1d9567d8c88cc85151e497357518270594" }, "downloads": -1, "filename": "CamelotPro-1.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "252c5b7049e72f5105ce4a1151b97880", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 24795, "upload_time": "2019-10-14T17:50:46", "url": "https://files.pythonhosted.org/packages/80/d7/49345cdf0f111aaa3a4f7c486f53d9c60168b11b90cf4dc207fa9ad3e7a4/CamelotPro-1.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "c0beb0d3f31f68bd50f9b7344b885a4b", "sha256": "20c9b182281e884bfd75012c464ae6a461fa3ab8a2db7755dcf2bdee9269aef7" }, "downloads": -1, "filename": "CamelotPro-1.0.0.tar.gz", "has_sig": false, "md5_digest": "c0beb0d3f31f68bd50f9b7344b885a4b", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9729, "upload_time": "2019-10-14T17:50:48", "url": "https://files.pythonhosted.org/packages/88/5b/9a2005916ed89a66564c331b0d84b2810261a203d031215d07d038af52fe/CamelotPro-1.0.0.tar.gz" } ], "1.1.0": [ { "comment_text": "", "digests": { "md5": "272a352d35ae215cf5cbe00fb2711fcf", "sha256": "02acd0ecac99d52ea3f27bb26a3b6161620b99c1d98ca176ebf536deb3c41a8e" }, "downloads": -1, "filename": "CamelotPro-1.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "272a352d35ae215cf5cbe00fb2711fcf", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 25520, "upload_time": "2019-10-20T18:17:40", "url": "https://files.pythonhosted.org/packages/1b/2c/ddfd6406b085964c2fb837f353cfbf612f0107e15a2c18fad0d4a9574923/CamelotPro-1.1.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ac34a6d788a582d9934129ce50f97ceb", "sha256": "1bfe3416d8c25382b5450d0cf2a08a4e6c52778439b1817402b84a4ea02eeb64" }, "downloads": -1, "filename": "CamelotPro-1.1.0.tar.gz", "has_sig": false, "md5_digest": "ac34a6d788a582d9934129ce50f97ceb", "packagetype": "sdist", 
"python_version": "source", "requires_python": null, "size": 10514, "upload_time": "2019-10-20T18:17:43", "url": "https://files.pythonhosted.org/packages/f5/5f/60af56ca6b515fc5e9b2b18754ab28d260109c4b2b2657ed6a8fc0f205cf/CamelotPro-1.1.0.tar.gz" } ], "1.2.0": [ { "comment_text": "", "digests": { "md5": "9614aefdb04985a573448b05e2178c91", "sha256": "4b5e8b491389d7cca6f22b9e29e9b058f2d8196e12501561ed3cf8c132121e9f" }, "downloads": -1, "filename": "CamelotPro-1.2.0-py3-none-any.whl", "has_sig": false, "md5_digest": "9614aefdb04985a573448b05e2178c91", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 25611, "upload_time": "2019-10-20T22:41:40", "url": "https://files.pythonhosted.org/packages/b9/4f/b7e88c343ec64b8656a690bb2ec9c90076c10187f7eb168d58b3ed4422c4/CamelotPro-1.2.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "662f8f2e375c43d27b3c2205f88d30d0", "sha256": "a71c46642818f473f76d6c840335dc28c1198fa92eaba36e32728aec0e283e3f" }, "downloads": -1, "filename": "CamelotPro-1.2.0.tar.gz", "has_sig": false, "md5_digest": "662f8f2e375c43d27b3c2205f88d30d0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9220, "upload_time": "2019-10-20T22:41:41", "url": "https://files.pythonhosted.org/packages/7f/c4/6cec877f24a2c0b5d11943f8b464a35a2a17aed695e31f009b04d26f8f4e/CamelotPro-1.2.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "9614aefdb04985a573448b05e2178c91", "sha256": "4b5e8b491389d7cca6f22b9e29e9b058f2d8196e12501561ed3cf8c132121e9f" }, "downloads": -1, "filename": "CamelotPro-1.2.0-py3-none-any.whl", "has_sig": false, "md5_digest": "9614aefdb04985a573448b05e2178c91", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 25611, "upload_time": "2019-10-20T22:41:40", "url": "https://files.pythonhosted.org/packages/b9/4f/b7e88c343ec64b8656a690bb2ec9c90076c10187f7eb168d58b3ed4422c4/CamelotPro-1.2.0-py3-none-any.whl" }, { 
"comment_text": "", "digests": { "md5": "662f8f2e375c43d27b3c2205f88d30d0", "sha256": "a71c46642818f473f76d6c840335dc28c1198fa92eaba36e32728aec0e283e3f" }, "downloads": -1, "filename": "CamelotPro-1.2.0.tar.gz", "has_sig": false, "md5_digest": "662f8f2e375c43d27b3c2205f88d30d0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9220, "upload_time": "2019-10-20T22:41:41", "url": "https://files.pythonhosted.org/packages/7f/c4/6cec877f24a2c0b5d11943f8b464a35a2a17aed695e31f009b04d26f8f4e/CamelotPro-1.2.0.tar.gz" } ] }