{ "info": { "author": "Xinyu Zhou", "author_email": "zxytim@gmail.com", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "Intended Audience :: Science/Research", "License :: OSI Approved :: BSD License", "Programming Language :: Python :: 3", "Topic :: Software Development :: Libraries :: Python Modules" ], "description": "# pdf2images\nConvert PDF file to image files **ROBUSTLY**.\n\n# Example\n```\n$ pdf2images -h\nusage: pdf2images [-h] [--max-size MAX_SIZE] pdf_file output_dir\n\npositional arguments:\n pdf_file\n output_dir\n\noptional arguments:\n -h, --help show this help message and exit\n --max-size MAX_SIZE max size of either side of the image\n```\n\n# Why another \"pdf-to-image\" package\nOnce in a while, I need to convert a PDF file (usually slides or an academic\npaper) into image files (thumbnails) so that readers can get a quick glance at\nit without downloading the PDF file.\n\nHowever, I found that none of the existing pdf2image solutions can robustly process all\nPDF files, since many PDF files are in a non-standard format or come with\nextensions. Each of them breaks in some cases.\n\nOn the bright side, for any plausible case, there is almost always at least\none of them that can process it successfully.\n\nSo I combined (a.k.a. *ensembled*) them to make it work across most cases.\n\n# Installation\nAs mentioned above, pdf2images combines multiple PDF manipulation libraries. Here is\nthe list of the libraries used:\n- [wand](http://docs.wand-py.org), an ImageMagick Python wrapper.\n- `pdftotext` command line tool provided by [xpdf](http://www.xpdfreader.com/)\n- [preview-generator](https://github.com/algoo/preview-generator)\n- [qpdf](https://github.com/qpdf/qpdf)\n\nwhere wand and preview-generator are Python packages that are automatically\ninstalled along with pdf2images. 
However, you have to install xpdf and qpdf\nmanually.\n\nOn Ubuntu:\n```\nsudo apt install -y qpdf xpdf libimage-exiftool-perl\n```\n\nOn Arch Linux:\n```\nsudo pacman -S --noconfirm qpdf xpdf perl-image-exiftool\n```\n\nInstalling pdf2images itself is simple:\n```\npip install pdf2images\n```\n\n# Robustness\nThis package has successfully processed hundreds of thousands of arXiv papers\n(for generating thumbnails).\n\n\n# Gallery\nThe following images are converted from a [slide deck](https://www.deeplearningbook.org/slides/02_linear_algebra.pdf) from the [Deep Learning Book](https://www.deeplearningbook.org/lecture_slides.html)\n\n![page-0](assets/0.png)\n![page-1](assets/1.png)\n![page-2](assets/2.png)\n![page-3](assets/3.png)\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/zxytim/pdf2images", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "pdf2images", "package_url": "https://pypi.org/project/pdf2images/", "platform": "", "project_url": "https://pypi.org/project/pdf2images/", "project_urls": { "Homepage": "https://github.com/zxytim/pdf2images" }, "release_url": "https://pypi.org/project/pdf2images/0.0.3/", "requires_dist": [ "preview-generator", "Wand", "plumbum", "tqdm" ], "requires_python": ">=3.5", "summary": "Convert PDF file to image files ROBUSTLY.", "version": "0.0.3" }, "last_serial": 5818989, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "73fdc0866b323b3f794e5f3f9123ba8f", "sha256": "65701ac6e3ca6f4b4739a099d9e8074941f748419a58bfd28f3d5a3033267339" }, "downloads": -1, "filename": "pdf2images-0.0.1-py3-none-any.whl", "has_sig": false, "md5_digest": "73fdc0866b323b3f794e5f3f9123ba8f", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 3142, "upload_time": "2019-07-26T17:36:51", "url": 
"https://files.pythonhosted.org/packages/f1/12/f6595f420140b92f6562934f54392b583cfeb3a7d1240b4539fd23d7f87e/pdf2images-0.0.1-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "8844ada5f1c51846ac7281a9c6a4c151", "sha256": "b29445d3bfafbcfaa6006c248ec6b9605999003ae5e367ff04b7a14044df0c82" }, "downloads": -1, "filename": "pdf2images-0.0.1.tar.gz", "has_sig": false, "md5_digest": "8844ada5f1c51846ac7281a9c6a4c151", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 2221, "upload_time": "2019-07-26T17:36:53", "url": "https://files.pythonhosted.org/packages/7a/51/8f8a7b63ffbde770db49ac290b277817ad556dcef17e97341270e4106235/pdf2images-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "b07cdecf130cc0da40c735c21ab78f38", "sha256": "4240f8b31eb74f9ca6c4375073d210037b443018577b78db6357ca58536ffb10" }, "downloads": -1, "filename": "pdf2images-0.0.2-py3-none-any.whl", "has_sig": false, "md5_digest": "b07cdecf130cc0da40c735c21ab78f38", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 6053, "upload_time": "2019-07-26T17:49:09", "url": "https://files.pythonhosted.org/packages/a4/8b/849d35007a54a5451ffa9b9fa57afffd1421c75a0d89c5bca2b449309429/pdf2images-0.0.2-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "628d8011c4006353e6d6752a7633741f", "sha256": "60a196d3ffe30375c34467eca3fbe3a7b81ee16d7b8528a6663e82053ab56381" }, "downloads": -1, "filename": "pdf2images-0.0.2.tar.gz", "has_sig": false, "md5_digest": "628d8011c4006353e6d6752a7633741f", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 4623, "upload_time": "2019-07-26T17:49:11", "url": "https://files.pythonhosted.org/packages/96/ce/4b312d4953f23fefb40c4122a2a92b1517fabbc014b20dd7fdf26ac64749/pdf2images-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "80808d8c51647fae9ecfc9ee07c72701", "sha256": 
"718b971a2c36c3d6ae746a77c63458c277f7ed8196d7528fbd6538a638b22017" }, "downloads": -1, "filename": "pdf2images-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "80808d8c51647fae9ecfc9ee07c72701", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 7250, "upload_time": "2019-09-12T08:46:21", "url": "https://files.pythonhosted.org/packages/63/80/865a71de7d31c6d4aa896a9ecb6848f3802296c81f6de1eda1b840012bc3/pdf2images-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f7c656abcdf96fa63dfce82efbf4ab87", "sha256": "b7336938dc289786d6b3b5cf8fd9341f258daa0b4ca4796c7fdc37b5e3de6f45" }, "downloads": -1, "filename": "pdf2images-0.0.3.tar.gz", "has_sig": false, "md5_digest": "f7c656abcdf96fa63dfce82efbf4ab87", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.5", "size": 5353, "upload_time": "2019-09-12T08:46:23", "url": "https://files.pythonhosted.org/packages/08/7a/9aa678ede12f8f3cb6c43068459e326bdb8b0ddd080edbbbebd51f6f9926/pdf2images-0.0.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "80808d8c51647fae9ecfc9ee07c72701", "sha256": "718b971a2c36c3d6ae746a77c63458c277f7ed8196d7528fbd6538a638b22017" }, "downloads": -1, "filename": "pdf2images-0.0.3-py3-none-any.whl", "has_sig": false, "md5_digest": "80808d8c51647fae9ecfc9ee07c72701", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": ">=3.5", "size": 7250, "upload_time": "2019-09-12T08:46:21", "url": "https://files.pythonhosted.org/packages/63/80/865a71de7d31c6d4aa896a9ecb6848f3802296c81f6de1eda1b840012bc3/pdf2images-0.0.3-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "f7c656abcdf96fa63dfce82efbf4ab87", "sha256": "b7336938dc289786d6b3b5cf8fd9341f258daa0b4ca4796c7fdc37b5e3de6f45" }, "downloads": -1, "filename": "pdf2images-0.0.3.tar.gz", "has_sig": false, "md5_digest": "f7c656abcdf96fa63dfce82efbf4ab87", "packagetype": "sdist", "python_version": "source", "requires_python": 
">=3.5", "size": 5353, "upload_time": "2019-09-12T08:46:23", "url": "https://files.pythonhosted.org/packages/08/7a/9aa678ede12f8f3cb6c43068459e326bdb8b0ddd080edbbbebd51f6f9926/pdf2images-0.0.3.tar.gz" } ] }