{ "info": { "author": "Markus Ressel", "author_email": "mail@markusressel.de", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7" ], "description": "# py-image-dedup [![Build Status](https://travis-ci.org/markusressel/py-image-dedup.svg?branch=master)](https://travis-ci.org/markusressel/gopass-chrome-importer) [![PyPI version](https://badge.fury.io/py/py-image-dedup.svg)](https://badge.fury.io/py/py-image-dedup)\n\n**py-image-dedup** is a tool to scan through a library of photos, find duplicates and remove them\nin a prioritized way.\n\nIt is built upon [Image-Match](https://github.com/ascribe/image-match), a very popular library to compute\na pHash for an image and store the result in an ElasticSearch backend for very high scalability.\n\n[![asciicast](https://asciinema.org/a/3WbBxMXnZyT1QnuTP9fm37wkS.svg)](https://asciinema.org/a/3WbBxMXnZyT1QnuTP9fm37wkS)\n\n# How to use\n\n## Setup elasticsearch backend\n\n### Elasticsearch version\n\nThis library requires elasticsearch version 5 or later. Sadly the\n[Image-Match](https://github.com/ascribe/image-match) library \nspecifies version 2 for no apparent reason, so you have to remove this\nrequirement from its requirements.\n\nBecause of this, **py-image-dedup** will exit with an **error on first install**.\n\nTo fix this, find the installed files of the image-match library, e.g.\n\n```\n../venv/lib/python3.6/site-packages/image_match-1.1.2-py3.6.egg-info/requires.txt \n```\n\nand remove the second line\n```\nelasticsearch<2.4,>=2.3\n```\n\nfrom the file. 
\nAfter that, **py-image-dedup** should install and run as expected.\n\n### Set up the index\n\nSince this library is based on [Image-Match](https://github.com/ascribe/image-match), \nyou need a running elasticsearch instance for efficient storage and \nquerying of image signatures.\n\n**py-image-dedup** uses a single index called `images` that you can create using the following command:\n\n```shell\ncurl -X PUT \"192.168.2.24:9200/images?pretty\" -H \"Content-Type: application/json\" -d \"\n{\n \\\"mappings\\\": {\n \\\"image\\\": {\n \\\"properties\\\": {\n \\\"path\\\": {\n \\\"type\\\": \\\"keyword\\\",\n \\\"ignore_above\\\": 256\n }\n }\n }\n }\n}\"\n```\n\n## Configuration\n\n**py-image-dedup** offers customization options to make sure it can \ndetect the best image with the highest probability possible.\n\n| Name | Description | Default |\n|------|-------------|---------|\n| threads | Number of threads to use for image analysis | `2` |\n| recursive | Toggle to analyse given directories recursively | `False` |\n| search_across_dirs | Toggle to allow duplicate results across given directories | `False` |\n| file_extensions | Comma-separated list of file extensions to analyse | `\"png,jpg,jpeg\"` |\n| max_dist | Maximum distance of image signatures to consider. This is a value in the range [0..1] | `0.1` |\n\n## Command line usage\n\n**py-image-dedup** can be used from the command line like this:\n\n```shell\npy-image-dedup deduplicate --help\n```\n\nHave a look at the help output to see how you can customize it.\n\n## Dry run\n\nTo analyze images and get an overview of which images would be deleted, \nbe sure to make a dry run first.\n\n```shell\npy-image-dedup -d \"/home/mydir\" --dry-run\n```\n\n# Contributing\n\nGitHub is for social coding: if you want to write code, I encourage contributions through pull requests from forks\nof this repository. 
Create GitHub tickets for bugs and new features and comment on the ones that you are interested in.\n\n# License\n\n```\npy-image-dedup by Markus Ressel\nCopyright (C) 2018 Markus Ressel\n\nThis program is free software: you can redistribute it and/or modify\nit under the terms of the GNU General Public License as published by\nthe Free Software Foundation, either version 3 of the License, or\n(at your option) any later version.\n\nThis program is distributed in the hope that it will be useful,\nbut WITHOUT ANY WARRANTY; without even the implied warranty of\nMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\nGNU General Public License for more details.\n\nYou should have received a copy of the GNU General Public License\nalong with this program. If not, see .\n```\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/markusressel/py-image-dedup", "keywords": "", "license": "GPLv3+", "maintainer": "", "maintainer_email": "", "name": "py-image-dedup", "package_url": "https://pypi.org/project/py-image-dedup/", "platform": "", "project_url": "https://pypi.org/project/py-image-dedup/", "project_urls": { "Homepage": "https://github.com/markusressel/py-image-dedup" }, "release_url": "https://pypi.org/project/py-image-dedup/1.0.0/", "requires_dist": [ "setuptools", "scipy", "numpy", "image-match", "elasticsearch (==6.3.1)", "tabulate", "tqdm", "click", "Pillow" ], "requires_python": "", "summary": "A library to find duplicate images and delete unwanted ones", "version": "1.0.0" }, "last_serial": 4913971, "releases": { "1.0.0": [ { "comment_text": "", "digests": { "md5": "b2bb4d0a4d14495b2f29b1a6c100306a", "sha256": "ce42bb325763f86e888e1357f4abc27f613b0289d39c6d215f1ce7c7bbf8eb51" }, "downloads": -1, "filename": "py_image_dedup-1.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "b2bb4d0a4d14495b2f29b1a6c100306a", 
"packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 20718, "upload_time": "2019-03-08T05:43:54", "url": "https://files.pythonhosted.org/packages/59/30/fe496043acca2fc0a88317cab359aba23e3af8ed946a240cd03324fe6c07/py_image_dedup-1.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "45ad918f682cc9f8b0d88a0d89e24ba4", "sha256": "31043395e5ed9bb035fffbb666f027502e356801ccde48861d020e345d69cf2c" }, "downloads": -1, "filename": "py-image-dedup-1.0.0.tar.gz", "has_sig": false, "md5_digest": "45ad918f682cc9f8b0d88a0d89e24ba4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18012, "upload_time": "2019-03-08T05:43:57", "url": "https://files.pythonhosted.org/packages/35/c0/8e9362cd9e77ac98b94fdc748da0c4e89a12b0aa9a9fc9c7de3cba342f58/py-image-dedup-1.0.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "b2bb4d0a4d14495b2f29b1a6c100306a", "sha256": "ce42bb325763f86e888e1357f4abc27f613b0289d39c6d215f1ce7c7bbf8eb51" }, "downloads": -1, "filename": "py_image_dedup-1.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "b2bb4d0a4d14495b2f29b1a6c100306a", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 20718, "upload_time": "2019-03-08T05:43:54", "url": "https://files.pythonhosted.org/packages/59/30/fe496043acca2fc0a88317cab359aba23e3af8ed946a240cd03324fe6c07/py_image_dedup-1.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "45ad918f682cc9f8b0d88a0d89e24ba4", "sha256": "31043395e5ed9bb035fffbb666f027502e356801ccde48861d020e345d69cf2c" }, "downloads": -1, "filename": "py-image-dedup-1.0.0.tar.gz", "has_sig": false, "md5_digest": "45ad918f682cc9f8b0d88a0d89e24ba4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 18012, "upload_time": "2019-03-08T05:43:57", "url": 
"https://files.pythonhosted.org/packages/35/c0/8e9362cd9e77ac98b94fdc748da0c4e89a12b0aa9a9fc9c7de3cba342f58/py-image-dedup-1.0.0.tar.gz" } ] }