{ "info": { "author": "eyeo GmbH", "author_email": "info@adblockplus.org", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "License :: OSI Approved :: GNU General Public License v3 (GPLv3)", "Natural Language :: English", "Operating System :: OS Independent", "Programming Language :: Python :: 3", "Topic :: Utilities" ], "description": "# AdMincer\n\nAdMincer is a command line tool for enriching datasets of screenshots used in\nML-based ad detection. It can probably be used with other object-detection\ndatasets, but ad detection is the main use case we're after.\n\n## Usage\n\nFrom the command line, run `$ admincer ` to get information on command\noptions and usage. Three commands are available: `place`, `extract` and `find`.\n\n### Place\n\nThis command places fragment images into the regions of source images. It takes\na directory with source images that have regions marked on them and multiple\nmappings of region type to fragment directory:\n\n $ admincer place -f ad=ads/dir -f label=labels:other/labels -n 5 source target\n\nThis will take images with marked regions from `source/`, place images from\n`ads/dir/` into the regions of type `ad` and images from `labels/` and\n`other/labels/` into the `label` regions. It will generate 5 images and store\nthem in `target/`.\n\nThe placements are performed in the order of region types on the command line.\nIn the example above first all `ad` regions will be placed and then all `label`\nregions.\n\n#### Region marking\n\nRegions of the images can be defined via a CSV file in the following format\n(the numbers are X and Y coordinates of the top left corner followed by the\nbottom right corner):\n\n image,xmin,ymin,xmax,ymax,label\n image1.png,50,50,80,90,region_type1\n image2.gif,10,10,20,20,region_type2\n\nThey can also be defined via TXT files of the same name as the image. 
The TXT\nfiles should be in the format commonly used with the YOLO object detector (the\nnumbers are region type number followed by coordinates of the center of the\nregion and then its width and height, coordinates and sizes are rescaled so\nthe full image is 1x1):\n\n 0 0.075 0.15 0.05 0.1\n 1 0.225 0.15 0.05 0.1\n\nIt's possible to provide names for the region type numbers by placing a file\nwith a `.names` extension into the directory. It should simply contain the names\non successive lines:\n\n region_type1\n region_type2\n\nWhen the names file is provided it's also possible to mix CSV and TXT region\ndefinitions but not for the same image.\n\nNote: regions that extend beyond the boundaries of the image will be clipped.\n\n#### Resize modes\n\nWhen the fragments placed into the regions are not of the same size as the\nregions, there are several possible options for resizing them. The default\nis to scale the fragment to match the size of the region. Another option is to\ncut off the part of the fragment that doesn't fit and place the rest into the\npart of the region that it would cover. Yet another approach is to cut off some\nparts and pad the remaining image to the size of the region. These modes are\ncalled `scale`, `crop` and `pad` respectively and they can be configured via\nthe `--resize-mode` command line option. Example:\n\n $ admincer place -f ad=ads/dir -f label=labels -r pad -r label=crop ...\n\nHere the first `-r` sets the default resize mode and the second one overrides\nit for the `label` region type.\n\n### Extract\n\nThis command extracts the contents of marked regions from source images. 
It\ntakes a directory with source images with marked regions (see above) and\nmultiple mappings of region type to target directory:\n\n $ admincer extract --target-dir ad=ads/dir -t label=labels source\n\nThis will load the images and region maps from `source` and will extract the\ncontents of the regions labeled `ad` and `label` into `ads/dir` and `labels`\ndirectories respectively.\n\n### Find\n\nThis command finds source images that have regions of specific types and sizes.\nFor example, the following command will find all images in the `source` directory\nthat have regions of the type `ad` 100 pixels wide by 50 pixels high.\n\n $ admincer find --region=ad:100x50 source\n\nThere's a certain tolerance for size mismatches. Normally it's +25% and -20%.\nTolerance can be configured via an additional parameter of the region query:\n\n $ admincer find -r ad:100x50:100 source\n\nHere height and width can be up to 100% larger and up to 50% smaller. In\ngeneral the tolerance value X allows the region to be X% larger than specified\nor the specification to be X% larger than the region.\n\nMultiple `--region`/`-r` options can be given. In this case images that contain\nat least one region matching each query will be found (i.e. multiple queries\nare combined using the `and` operator).\n\n## Questions\n\n- Fragment matching policy (current one allows scaling by 80% to 125%).\n- What to do if there are no valid candidate fragments for placement? Right now\n we bomb out with an exception.\n- Do we want sampling with/without replacement? Or maybe some kind of\n deterministic selection? 
Right now it's with replacement.", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://gitlab.com/eyeo/machine-learning/admincer/", "keywords": "ad-detection machine-learning dataset", "license": "GPLv3", "maintainer": "", "maintainer_email": "", "name": "admincer", "package_url": "https://pypi.org/project/admincer/", "platform": "", "project_url": "https://pypi.org/project/admincer/", "project_urls": { "Homepage": "https://gitlab.com/eyeo/machine-learning/admincer/" }, "release_url": "https://pypi.org/project/admincer/1.0.0/", "requires_dist": null, "requires_python": "", "summary": "Tool for managing datasets for visual ad detection", "version": "1.0.0" }, "last_serial": 5890815, "releases": { "1.0.0": [ { "comment_text": "", "digests": { "md5": "3d47ff990f172e14acbb4efbb18e8cb6", "sha256": "42c451c6a2b3ab7d840eb3cdf1b3fa8524ae68203a29d8e670e0831ccaaa0089" }, "downloads": -1, "filename": "admincer-1.0.0.tar.gz", "has_sig": false, "md5_digest": "3d47ff990f172e14acbb4efbb18e8cb6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 46947, "upload_time": "2019-09-26T14:08:17", "url": "https://files.pythonhosted.org/packages/34/fa/1fda0068c30e9fc9a48f362d699d5760ef00fc946c963bdae5d7fea1f7de/admincer-1.0.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "3d47ff990f172e14acbb4efbb18e8cb6", "sha256": "42c451c6a2b3ab7d840eb3cdf1b3fa8524ae68203a29d8e670e0831ccaaa0089" }, "downloads": -1, "filename": "admincer-1.0.0.tar.gz", "has_sig": false, "md5_digest": "3d47ff990f172e14acbb4efbb18e8cb6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 46947, "upload_time": "2019-09-26T14:08:17", "url": "https://files.pythonhosted.org/packages/34/fa/1fda0068c30e9fc9a48f362d699d5760ef00fc946c963bdae5d7fea1f7de/admincer-1.0.0.tar.gz" } ] }