{ "info": { "author": "Arthur RENAUD ; Antoine MARULLAZ => Stackadoc", "author_email": "arthur.b.renaud@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 1 - Planning", "License :: OSI Approved", "Natural Language :: French", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 3.6", "Topic :: Communications" ], "description": "# pdfannot\n\nThis package aims to create a two-way link between annotated pdf and excel data frame.\n\nIt allows you to :\n\n - create an DataFrame containing each string annotated of the pdf in a column 'annot_text', along with its annotation in a column 'label' and information such as coordinates, page etc.\n - annotate a pdf given an DataFrame of the form described above.\n \nIt can be really useful for generating automatically annotated pdf documents with NLP models capable to\ninfer annotations from raw texts in a data frame.\n\n\n### Prerequisites\n\n- pandas\n- fitz\n\n(pip install pymupdf)\n\n\n\n### Installing\n\npip install pdfannot\n\n\n### Examples\n\nYour DataFrame must contains info on what to annotate on the pdf :\n\n import pdfannot\n import pandas as pd\n \n # adf stands for annotation dataframe\n adf = pd.DataFrame([\n {'x': 40, 'y': 60, 'w': 300, 'h': 50}, \n {'text': 'APPEAL relating to Cancellation Proceedings No 399', 'type': 'Highlight'},\n {'text': 'ication for a declaration of i', 'type': 'Highlight', 'label': 'label 1'},\n {'x': 100, 'y': 600, 'w': 300, 'h': 50, 'page': 1, 'label': 'label 2'}, \n ])\n \n # pdfannot.exple_pdf is a test pdf shipped with pdf annot package / debug is set to True for some verbose\n pdfannot.annotate_pdf(adf, pdfannot.exple_pdf, '/tmp/test.pdf', debug=True)\n\n\n \nYour annotation dataframe should have already columns 'x','y','h','w' (coordinate to make a square) or 'text' (texts to annotate).\n \n annotate_pdf(DataFrame, orig_pdfpath, dest_pdfpath)\n \nwill use your DataFrame and the directory of your pdf passed in argument to annotate it and store it at dest_pdfpath.\n\nThe function also considers optional columns 'label' to label your annotations and 'type' to specify whether you want \na 'Square' or a 'Highlight'. \n\nDefaults are label : '' and type : 'Square'. \n\nFinally, specifying the annotation's page with a column 'page' speeds up the algorithm. \"page\" is optional for 1 page pdfs.\n\n\n### Internals\n\nHowever if your DataFrame has one text column per label of annotation (WARNING : each of them must be named annot_{label_name}) then you can transform it with :\n\n annot_utils.dlf2adf(DataFrame)\n\nto make it acceptable by annotate_pdf. After this execute :\n\n annotate_pdf(DataFrame, orig_pdfpath, dest_pdfpath)\n\nto annotate your pdf (this method allows only highlights).\n \n \n### Authors\n\nArthur Renaud, Antoine Marullaz --> Stackadoc\n\nAny recommendation/question ? --> contact@stackadoc.com", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://bitbucket.org/ArthurRenaud/pdfannot/", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "pdfannot", "package_url": "https://pypi.org/project/pdfannot/", "platform": "", "project_url": "https://pypi.org/project/pdfannot/", "project_urls": { "Homepage": "https://bitbucket.org/ArthurRenaud/pdfannot/" }, "release_url": "https://pypi.org/project/pdfannot/2019.6.5.1/", "requires_dist": null, "requires_python": ">=3.5", "summary": "PDF Annotation Utils", "version": "2019.6.5.1" }, "last_serial": 5362404, "releases": { "0.0.0": [ { "comment_text": "", "digests": { "md5": "c8c6ce25380deef4b5d1b9fc7b84f285", "sha256": "d6e4661b96dec618df496dfe7b6c31d79a4f8921e534b1f716c200bbbce102a0" }, "downloads": -1, "filename": "pdfannot-0.0.0.tar.gz", "has_sig": false, "md5_digest": "c8c6ce25380deef4b5d1b9fc7b84f285", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 359356, "upload_time": "2019-04-19T08:53:05", "url": "https://files.pythonhosted.org/packages/45/86/f453c52644852b31f5bef8396f57bd1a263d221d51d32283b79116a74ba5/pdfannot-0.0.0.tar.gz" } ], "0.0.1": [ { "comment_text": "", "digests": { "md5": "e1047c6606169b3cfa509ce0b3e06429", "sha256": "ecb03ebbf4ef072b5f17d8811e89b371de50abed65f53fd72028ff782ba23fc1" }, "downloads": -1, "filename": "pdfannot-0.0.1.tar.gz", "has_sig": false, "md5_digest": "e1047c6606169b3cfa509ce0b3e06429", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 359567, "upload_time": "2019-04-23T07:56:59", "url": "https://files.pythonhosted.org/packages/a0/52/9ab7ba830ed38733b6fe9da5b77bf4169026a2647e214acc95ead351f32d/pdfannot-0.0.1.tar.gz" } ], "0.0.10": [ { "comment_text": "", "digests": { "md5": "5b505e78696933c654884c4c1760359a", "sha256": "8d6266b66840268589de6474b7e5411f6d7a4a1eb6f953cfa5e9880cc17733a1" }, "downloads": -1, "filename": "pdfannot-0.0.10.tar.gz", "has_sig": false, "md5_digest": "5b505e78696933c654884c4c1760359a", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 363379, "upload_time": "2019-05-14T10:01:41", "url": "https://files.pythonhosted.org/packages/97/96/f4b64435ed645057a8f45ea6aa639b0d37c571d69914463ebc2530091f04/pdfannot-0.0.10.tar.gz" } ], "0.0.11": [ { "comment_text": "", "digests": { "md5": "858e938d0e13194198d0d13f9dba38f2", "sha256": "52dd216a39bd089c375ae91b02a7bc11a43fe8284e421536d921d97c7ad3883e" }, "downloads": -1, "filename": "pdfannot-0.0.11.tar.gz", "has_sig": false, "md5_digest": "858e938d0e13194198d0d13f9dba38f2", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 363460, "upload_time": "2019-05-27T09:11:27", "url": "https://files.pythonhosted.org/packages/bd/4b/cc316fff0387dd3958a5fedc6311790dee3fe256d3823c8aa160eca12865/pdfannot-0.0.11.tar.gz" } ], "0.0.12": [ { "comment_text": "", "digests": { "md5": "e0acd40f6a20fcd520695c6f7e01bb61", "sha256": "6620479598a31129d8b26865446e8fcc7952c29c136376fa6006b4cabc80dfce" }, "downloads": -1, "filename": "pdfannot-0.0.12.tar.gz", "has_sig": false, "md5_digest": "e0acd40f6a20fcd520695c6f7e01bb61", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 465349, "upload_time": "2019-05-27T09:11:29", "url": "https://files.pythonhosted.org/packages/6d/d2/49626aeb88592f6b03ed1d9421f758b8772e925ed45b2f951c00e78460cd/pdfannot-0.0.12.tar.gz" } ], "0.0.13": [ { "comment_text": "", "digests": { "md5": "71fa1ebda547cfaccc484fed389f318c", "sha256": "4fe1ac9bb0fd78b7b9df7b800dfc7a1b116e7fa8b4acaa5bb1e066bdb7467fe7" }, "downloads": -1, "filename": "pdfannot-0.0.13.tar.gz", "has_sig": false, "md5_digest": "71fa1ebda547cfaccc484fed389f318c", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 36314, "upload_time": "2019-05-27T13:56:31", "url": "https://files.pythonhosted.org/packages/57/3d/c9397e0d1dc301c5c83321c66889a5bf0221f0502058925808e315ce50a1/pdfannot-0.0.13.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "13b4fea05adca2de4a18e7518c245c96", "sha256": "3c36abe7d1d182e3dcd22dfee5147c777913b7734008d2c25108f32b45a756b0" }, "downloads": -1, "filename": "pdfannot-0.0.2.tar.gz", "has_sig": false, "md5_digest": "13b4fea05adca2de4a18e7518c245c96", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 359576, "upload_time": "2019-04-23T08:28:32", "url": "https://files.pythonhosted.org/packages/d5/56/52aa5d1aae35c2e143d312f9e5819276065d7ae3a0743d1aad6b9fa55760/pdfannot-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "c2844b2c163ae65e419b86c4502c7aa3", "sha256": "2f30ac12e07081659c7921da73ab6b48b9cd1d7497641d666f5d6aa1c49471e2" }, "downloads": -1, "filename": "pdfannot-0.0.3.tar.gz", "has_sig": false, "md5_digest": "c2844b2c163ae65e419b86c4502c7aa3", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 359599, "upload_time": "2019-04-23T14:49:47", "url": "https://files.pythonhosted.org/packages/2b/5c/b6c2a40997b6cce5dac561af1c1540a640babc964aa0c94134bfd429a1f3/pdfannot-0.0.3.tar.gz" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "c2e7f9d2d1518daacfdb42c3fead46f9", "sha256": "e851508ade80a2153e28ebea9fc502924b9481d75c6aef36f06d4bd005dfcf39" }, "downloads": -1, "filename": "pdfannot-0.0.4.tar.gz", "has_sig": false, "md5_digest": "c2e7f9d2d1518daacfdb42c3fead46f9", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 359606, "upload_time": "2019-04-23T15:13:01", "url": "https://files.pythonhosted.org/packages/99/a5/b9c76ca14f34c900a218672815d678b5096555bcb31414ccae041cff5467/pdfannot-0.0.4.tar.gz" } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "186aeff3ebe05bfab9eab74ecd059889", "sha256": "cf1af640db76372368895dc3a4875f00680b7f8aed297475925613769dfbdbdf" }, "downloads": -1, "filename": "pdfannot-0.0.5.tar.gz", "has_sig": false, "md5_digest": "186aeff3ebe05bfab9eab74ecd059889", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 359919, "upload_time": "2019-04-23T16:36:37", "url": "https://files.pythonhosted.org/packages/71/5e/92139570266ed500895962aff69314e1bad4ef3b5c3d4ecd6ad9fd3bb8cd/pdfannot-0.0.5.tar.gz" } ], "0.0.6": [ { "comment_text": "", "digests": { "md5": "941f016894ea2bb1ae55b854efcec5cf", "sha256": "b7d9f19e16a387dc931d7adaaa5efd11b5f860450c8f492277a4c5da6297a651" }, "downloads": -1, "filename": "pdfannot-0.0.6.tar.gz", "has_sig": false, "md5_digest": "941f016894ea2bb1ae55b854efcec5cf", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 359915, "upload_time": "2019-04-24T12:19:39", "url": "https://files.pythonhosted.org/packages/24/c3/a47c0ef9db4f75d6ad89bf49b9556112d0d9bc6ef156631f893714d804ac/pdfannot-0.0.6.tar.gz" } ], "0.0.7": [ { "comment_text": "", "digests": { "md5": "b0a4554e5c8fa57f25bf0c60518cb2fe", "sha256": "807b58c9b466b1cc68850fdb9637fc7e55415decbbb809250e1b94da23ea0048" }, "downloads": -1, "filename": "pdfannot-0.0.7.tar.gz", "has_sig": false, "md5_digest": "b0a4554e5c8fa57f25bf0c60518cb2fe", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 360310, "upload_time": "2019-05-09T08:34:15", "url": "https://files.pythonhosted.org/packages/3e/15/6e121335997277b4386461d53857fb99f67488dc468b72c6d4949a6c390d/pdfannot-0.0.7.tar.gz" } ], "0.0.8": [ { "comment_text": "", "digests": { "md5": "97acd648bbcb9dc71f64ae5084aba373", "sha256": "1365bbb3417e963f7f49d9803140aa0701470280e593a73e5d4846d682fb1a86" }, "downloads": -1, "filename": "pdfannot-0.0.8.tar.gz", "has_sig": false, "md5_digest": "97acd648bbcb9dc71f64ae5084aba373", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 360306, "upload_time": "2019-05-09T08:45:40", "url": "https://files.pythonhosted.org/packages/93/35/c6bfc21aa383533c5dc10903e5cc5af5d3b4ae637267dbb4c7d9e093835c/pdfannot-0.0.8.tar.gz" } ], "0.0.9": [ { "comment_text": "", "digests": { "md5": "c40078e46e7d46f902271a8cb3843c56", "sha256": "6f242ef0f6676dc0e596783067e96a91bf06986db7915b7ae0cfcfaae25fa6f1" }, "downloads": -1, "filename": "pdfannot-0.0.9.tar.gz", "has_sig": false, "md5_digest": "c40078e46e7d46f902271a8cb3843c56", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 360707, "upload_time": "2019-05-10T14:53:22", "url": "https://files.pythonhosted.org/packages/22/12/5023ddb34f4a738582f737f221e09660d2347a70a0fe3a8ade940189fb9e/pdfannot-0.0.9.tar.gz" } ], "2019.5.27.1": [ { "comment_text": "", "digests": { "md5": "76572027cd5c472a09b5c5bb5c20ec41", "sha256": "c6c2b29f2129e06a27e48b29590eb9567270126b001f15d1f05f6ac376313ecb" }, "downloads": -1, "filename": "pdfannot-2019.5.27.1.tar.gz", "has_sig": false, "md5_digest": "76572027cd5c472a09b5c5bb5c20ec41", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 36307, "upload_time": "2019-05-27T14:00:33", "url": "https://files.pythonhosted.org/packages/29/55/ba4c81e779d48b39ae185b732d3747feaac43451a882badbf15266d70649/pdfannot-2019.5.27.1.tar.gz" } ], "2019.6.5.1": [ { "comment_text": "", "digests": { "md5": "cbf44923bf44024dd99f4168b8acac32", "sha256": "46e92e35fabd52c82cedb043d361845955daf6eac75c00759ad6ac3e90689fb2" }, "downloads": -1, "filename": "pdfannot-2019.6.5.1.tar.gz", "has_sig": false, "md5_digest": "cbf44923bf44024dd99f4168b8acac32", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 737441, "upload_time": "2019-06-05T13:04:44", "url": "https://files.pythonhosted.org/packages/1a/b2/0721f54aeae00a10526a9847cb7693a0a02dda2d55f0da5cc79925616422/pdfannot-2019.6.5.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "cbf44923bf44024dd99f4168b8acac32", "sha256": "46e92e35fabd52c82cedb043d361845955daf6eac75c00759ad6ac3e90689fb2" }, "downloads": -1, "filename": "pdfannot-2019.6.5.1.tar.gz", "has_sig": false, "md5_digest": "cbf44923bf44024dd99f4168b8acac32", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 737441, "upload_time": "2019-06-05T13:04:44", "url": "https://files.pythonhosted.org/packages/1a/b2/0721f54aeae00a10526a9847cb7693a0a02dda2d55f0da5cc79925616422/pdfannot-2019.6.5.1.tar.gz" } ] }