{ "info": { "author": "NVRAM", "author_email": "nvram@users.sourceforge.net", "bugtrack_url": null, "classifiers": [ "Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Topic :: Software Development :: Libraries :: Python Modules", "Topic :: System :: Archiving :: Packaging", "Topic :: Utilities" ], "description": "tarwalker 1.1\n=============\n\nSummary\n-------\n \n*TarWalker* provides a method to easily scan files somewhat like\n`os.walk `_,\nhandling compressed files, recursing through directories and scanning\nwithin `tarfiles `_.\n\nThe library is very stable, changes are rare. It well documented and\nhas full unit testing (100% code coverage), and is maintained.\n\n\n.. contents:: **Index**\n :depth: 2\n :local:\n\n----------\n\nBuild Status\n------------\n\n.. image:: https://gitlab.com/n2vram/tarwalker/badges/master/build.svg\n :alt: Build Status\n :target: https://gitlab.com/n2vram/tarwalker/\n\n\nOverview\n--------\n\nNotes about this library:\n\n1. It can walk through compressed or uncompressed files and `tarfiles\n `_ (and optionally\n directories), processing the target files one at a time.\n\n2. It uses a pair of callbacks to avoid opening or decompressing files\n that do are not of interest:\n\n a. The **matcher** is called first with meta data of the file, and\n returns true if the file is to be used.\n\n b. If so, the **handler** is called with the meta data, the **matcher**\n return value, and an open (possibly decompressed) file handle.\n\n3. Decompression is done on the stream, to reduce memory requirements\n and to avoid wasted processing when a **handler** returns early.\n\n4. If the **recurse** parameter is true and the walker encounters a\n `tarfile` embedded within a `tarfile`, its contents will also be\n scanned the same way.\n\nThere are two (2) classes that are provided. The primary difference\nis that **TarWalker** will throw an exception if given a directory.\n\n- **TarWalker** handles compressed or uncompressed files and `tarfile`\n archives.\n\n- **TarDirWalker** is a subclass of **TarWalker** that expands it to\n recursively walk through directories, processing any files\n encountered.\n\n\nInstallation\n------------\nInstall the package using **pip**, eg:\n\n pip install --user tarwalker\n\n pip3 install --user tarwalker\n\n\nExamples\n--------\n\nThe following is simple tool to look for a given string within files.\nFiles can be given as arguments or within tarballs, and must end with\neither '.log' (w/an optional numeric suffix) or with '.txt':\n\n.. code:: python\n\n import re\n import sys\n\n from tarwalker import TarWalker\n\n PATTERN = re.compile(r'.*\\.(txt|log(\\.\\d+)?)$')\n\n\n def handler(fileobj, filename, arch, info, match):\n try:\n for line in fileobj:\n if text in line:\n path = (arch + ':') if arch else ''\n print(\"Found in: \" + path + filename)\n return\n except IOError:\n pass\n\n\n text = sys.argv[1]\n walker = TarWalker(file_handler=handler, name_matcher=PATTERN.match, recurse=False)\n\n for arg in sys.argv[2:]:\n walker.handle_path(arg)\n \n\n\nConstructors and Callbacks\n--------------------------\n\nConstructing an instance of *TarWalker* or *TarDirWalker* take the\nsame parameters. Note that at most one of *file_matcher* or\n*name_matcher* is allowed.\n\n* **file_handler** (required) a callable taking five (5) positional parameters:\n\n * **fileobj** - a readable `file` object for the file contents.\n * **filepath** - a `str` with the filename, either as one of:\n\n * the file path given to *handle_path()*, or\n * the path of a file found beneath a directory given to *handle_path()*.\n * the file path of a file within an expanded tar archive.\n\n * **archname** - a `str` path of the tar archive name, when handling a\n file found within a tar archive. It will be a colon (':')\n separated list if reading a recursive tar archive.\n\n * **fileinfo** - may be `None` or an object with the following\n attributes. See `os.stat\n `_ for more\n details:\n\n * **name** - the `str` name of the file,\n * **size** - the size of the file in bytes,\n * **mtime** - modification time, in POSIX (epoch) time,\n * **mode** - the file permission bits,\n * **uid** - the file owner's User ID, and\n * **gid** - the file owner's Group ID\n\n * MATCH - the value returned from the `name_matcher` or `file_matcher` call.\n\n **NOTE:** files with a compression suffix will have the suffix\n removed, and the file object will return decompressed contents.\n *For example*, for \"foo.txt.gz\" `filepath` would be \"foo.txt\" and `fileobj`\n would be the equivalent contents of \"foo.txt\".\n\n* **file_matcher** (optional) a callable that takes two (2) positional\n parameters and returns true if the file should be opened and\n passed to the `file_handler` callback:\n\n * **filepath** - See `filepath` above.\n * **fileinfo** - See `fileinfo` above.\n\n* **name_matcher** (optional) a callable that takes one (1) positional\n parameter and returns true if the file be opened and passed to\n `file_handler`:\n\n * **filepath** - See `file_handler`, above.\n\n* *recurse* (optional) If true, the algorithm will recurse into\n tarballs found within other tarballs. Furthermore, if `recurse` is a\n callable it will be called before and after opening an interior\n tarball, with four (4) positional parameters:\n\n * **start** - a bool that indicates recursion into the given tarball\n is starting; it is False on the second call.\n * **tarname** - name of the contained (interior) tarball, see `filepath` above.\n * **archive** - the name of the containing (exterior) tarball, see `archname` above.\n * **fileinfo** - See `fileinfo` above.\n\n\nKnown Issues\n------------\n\nIf you think you have found a defect, or wish to add an enhancement\nrequest, please do so via the `GitLab issues page:\n`_.\n\n- The ARCHNAME passed to the *file_handler* callback uses ':' as a\n separator, which is a legal filename component, so does not\n necessarily indicate a nested archive.\n\n- The *recurse* feature will scan an embedded `tarfile`, but there is\n currently no mechanism to avoid scanning a `tarfile` found within an\n embedded `tarfile` (at any level). If needed, please submit an\n enhancement request.\n\n- There are lots of other compression algorithms that are not handled.", "description_content_type": "", "docs_url": null, "download_url": "https://gitlab.com/n2vram/tarwalker/archive/1.1", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://gitlab.com/n2vram/tarwalker", "keywords": "tarfile,gzip,streaming", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "tarwalker", "package_url": "https://pypi.org/project/tarwalker/", "platform": "any", "project_url": "https://pypi.org/project/tarwalker/", "project_urls": { "Download": "https://gitlab.com/n2vram/tarwalker/archive/1.1", "Homepage": "https://gitlab.com/n2vram/tarwalker" }, "release_url": "https://pypi.org/project/tarwalker/1.1/", "requires_dist": null, "requires_python": "", "summary": "A library to walk through tar archives, simplifying use by handling listing and decompression.", "version": "1.1" }, "last_serial": 4275806, "releases": { "0.1": [], "1.0": [ { "comment_text": "", "digests": { "md5": "aa3e62b670f68284cbff3eceeed95f88", "sha256": "b96832839689f6f9e9531b564fd1a562312aea773dc39df8b3c1d1fa873cbdf1" }, "downloads": -1, "filename": "tarwalker-1.0.tar.gz", "has_sig": false, "md5_digest": "aa3e62b670f68284cbff3eceeed95f88", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 6940, "upload_time": "2017-04-29T22:21:30", "url": "https://files.pythonhosted.org/packages/fa/d0/185b054e263889672ea0abdb18aac066d66e5c64e957e681dc788e5bce1d/tarwalker-1.0.tar.gz" } ], "1.1": [ { "comment_text": "", "digests": { "md5": "10b28a3f25a7f1410e902ac228d545f3", "sha256": "bb103a0cc3717bb9083f24a2ccd13a49641a0ee5ec2ea80ee5d42a2803516315" }, "downloads": -1, "filename": "tarwalker-1.1.tar.gz", "has_sig": false, "md5_digest": "10b28a3f25a7f1410e902ac228d545f3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7905, "upload_time": "2018-09-16T03:08:23", "url": "https://files.pythonhosted.org/packages/84/e3/5282d6ee7d50de20f1c6fa8221bab06a020036bab5ddeb3de6cbee069296/tarwalker-1.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "10b28a3f25a7f1410e902ac228d545f3", "sha256": "bb103a0cc3717bb9083f24a2ccd13a49641a0ee5ec2ea80ee5d42a2803516315" }, "downloads": -1, "filename": "tarwalker-1.1.tar.gz", "has_sig": false, "md5_digest": "10b28a3f25a7f1410e902ac228d545f3", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 7905, "upload_time": "2018-09-16T03:08:23", "url": "https://files.pythonhosted.org/packages/84/e3/5282d6ee7d50de20f1c6fa8221bab06a020036bab5ddeb3de6cbee069296/tarwalker-1.1.tar.gz" } ] }