{ "info": { "author": "Nico Coetzee", "author_email": "nicc777@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 4 - Beta", "Environment :: Console", "Environment :: MacOS X", "Environment :: Win32 (MS Windows)", "Intended Audience :: End Users/Desktop", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3.7", "Programming Language :: Python :: 3.8", "Topic :: System :: Archiving", "Topic :: System :: Filesystems", "Topic :: Utilities" ], "description": "# py_workdocs_prep\n\nA bulk directory and file renaming utility to prepare files for migration to [AWS WorkDocs](https://aws.amazon.com/workdocs/)\n\nIf you run the script, it will start to traverse the current directory and will do one of the following with each file and directory:\n\n* Keep as is\n* Rename\n* Delete\n\nAll actions taken will be written out to STDOUT after all operations are completed\n\n**WARNING** The actions will make changes to your directories and/or files. It is *HIGHLY RECOMMENDED* you first do a full backup of your data.\n\nThis project was a result of me migrating from Dropbox to AWS Workdocs and finding a lot of issues due to the names of files and/or directories that were invalid in AWS Workdocs.\n\nFor details of this potential problem, refer to the [AWS Workdocs Administration Guide](https://docs.aws.amazon.com/workdocs/latest/adminguide/prepare.html)\n\nHere are the most important limitations as of 2019-10-26:\n\n* Amazon WorkDocs Drive displays only files with a full directory path of 260 characters or fewer\n* Invalid characters in names:\n * Trailing spaces\n * Periods at the beginning or end\u2013For example: `.file`, `.file.ppt`, `.`, `..`, or `file.`\n * Tildes at the beginning or end\u2013For example: `file.doc~`, `~file.doc`, or `~$file.doc`\n * File names ending in .tmp\u2013For example: `file.tmp`\n * File names exactly matching these case-sensitive terms: `Microsoft User Data`, `Outlook files`, `Thumbs.db`, or `Thumbnails`\n * 
File names containing any of these characters \u2013 `*` (asterisk), `/` (forward slash), `\\` (back slash), `:` (colon), `<` (less than), `>` (greater than), `?` (question mark), `|` (vertical bar/pipe), `\"` (double quotes), or \\202E (character code 202E)\n\n## Quick Start\n\nThe following examples assume a MS Windows system, as the intent is to prepare a directory for AWS WorkDocs, which typically only has clients for Windows (unless you are on mobile).\n\n### From PyPi\n\nPrerequisites:\n\n* Python 3.7+\n\nThe example below will show how to get started very quickly using the most current version. The example will demonstrate a dry-run operation that will allow you to inspect the log file and review changes before committing.\n\nAssuming you are on the Windows command line:\n\n```shell\n> pip install py-workdocs-prep\n> cd \n> wdp --dry-run\n```\n\nA log file called `py_workdocs_prep.log` will be generated. If it already exists, new entries will be appended.\n\n**NOTE** It is highly recommended that you inspect the log file and understand how the application will change your files - and delete certain directories and files. Also take special note of any warnings, especially those about the total path length that may be too long (search for the string `TOTAL LENGTH EXCEEDED THRESHOLD`). [Read here](https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file?redirectedfrom=MSDN#maximum-path-length-limitation) why this is important.\n\nTo commit all changes, but first back up all files and directories, you can run the following (assuming the application is already installed):\n\n```shell\n> wdp -b\n```\n\n#### Command Line Arguments\n\n| Option | Description | Example |\n|--------------------|---------------------------------------------------------------------------------------------------------------------------------------|-------------------|\n| `-b` or `--backup` | Create a backup of all current files and directories. A `tar.gz` file will be created. 
| `> wdp -b` |\n| `--dry-run` | The application will not perform any file or directory modifications, but only log what would be done. | `> wdp --dry-run` |\n| `--delete-dirs` | Define a comma-separated list of directories to be deleted. Don't include any spaces, but rather use proper Python regular expressions. | `> wdp --delete-dirs=\"test1,ven*,node_mod*\"` |\n\n### From Source\n\nPrerequisites:\n\n* Python 3.7+\n* git\n\nAssuming your target directory is something like `D:\\Dropbox`, and you want to back up first, you can run the following commands:\n\n```bash\n> git clone https://github.com/nicc777/py_workdocs_prep.git\n> cd py_workdocs_prep\n> python setup.py sdist\n> pip install dist\\*\n> d:\n> cd Dropbox\n> wdp -b\n```\n\n## Strategy\n\nI had a very large number of files (600,000+) and it turned out a lot of them violated the mentioned restrictions. I had to make a plan...\n\nHere is how the script works:\n\n### Long path names\n\nThe default Windows starting folder is `W:\\My Documents\\` and it contains 16 characters. \n\nTherefore, any other directory and/or file name combined in my Dropbox root folder had to come in under 244 characters.\n\nI decided that after the transformation, I would just print WARNINGS for each item with the number of characters over. I would then make a decision later on to either rename some part of the directory and/or file name or sometimes completely reorganize the directory structure. This would remain a manual operation.\n\n### Getting rid of redundant files\n\nAs I used Dropbox as a \"working\" documents directory I ended up with a large number of `.git`, `venv` and `node_modules` directories (to name a few examples). So the obvious first step for me was to delete all these directories. (`DONE`)\n\nFiles that will also be deleted include files starting or ending with the tilde (`~`) character. (`PENDING`)\n\nFiles ending in `.tmp` will also be deleted. 
(`PENDING`)\n\n### Directory and file renaming strategy\n\nAny directory names and files containing any of the listed invalid characters (including any whitespace) will be renamed, replacing the violating characters with an underscore (`_`) character. Repeating underscore characters will be replaced with just a single underscore character.\n\n## Processing Methodology\n\nProcessing will follow this order:\n\n1. First, all directories will be traversed and file names will be checked:\n 1. If it is identified as a file to be deleted, write out a delete command\n 2. Process illegal characters and issue a rename command if required\n2. Now traverse all directories and identify all directories to be renamed\n 1. After the list is determined: order the list in terms of length (from longest to shortest)\n 2. Loop through the list and commit rename commands\n3. Now, assuming we have a list of final directory and file names, determine which items are over the total length limit and print warnings for these\n\n## Acknowledgements\n\nThanks to [NanoDano](https://www.devdungeon.com/users/nanodano) for the [examples](https://www.devdungeon.com/content/walk-directory-python) I used to walk through the directories.\n\n## Geek Food\n\n### Manual Testing\n\nTo inspect the project and prepare for migrating to AWS Workdocs...\n\nClone the project and `cd` into the project directory\n\n```python\n>>> from py_workdocs_prep.py_workdocs_prep import start\n>>> start()\n```\n\n### Memory Profiling\n\nYou can try the following:\n\n```bash\n> pip install -U memory_profiler\n```\n\nThen:\n\n```python\n>>> from py_workdocs_prep.py_workdocs_prep import start\n>>> from memory_profiler import memory_usage\n>>> memory_usage((start, ('D:\\\\Dropbox',))) \nStarting in \"D:\\Dropbox\"\n[15.54296875, 15.54296875, 15.54296875,..., 178.421875]\n```\n\nThis means the script started scanning the directory `D:\\Dropbox` and the application grew from a starting 15.5 MiB 
to 178.4 MiB (early testing).\n\nMy machine has plenty of RAM, so this was acceptable for me.\n\nReferences:\n\n* [memory_profiler](https://pypi.org/project/memory-profiler/)", "description_content_type": "text/markdown", "docs_url": null, "download_url": "https://github.com/nicc777/py_workdocs_prep/releases/download/release-0.5.1/py_workdocs_prep-0.5.1.tar.gz", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/nicc777/py_workdocs_prep", "keywords": "aws workdocs", "license": "", "maintainer": "", "maintainer_email": "", "name": "py-workdocs-prep", "package_url": "https://pypi.org/project/py-workdocs-prep/", "platform": "", "project_url": "https://pypi.org/project/py-workdocs-prep/", "project_urls": { "Bug Reports": "https://github.com/nicc777/py_workdocs_prep/issues", "Download": "https://github.com/nicc777/py_workdocs_prep/releases/download/release-0.5.1/py_workdocs_prep-0.5.1.tar.gz", "Homepage": "https://github.com/nicc777/py_workdocs_prep", "Source": "https://github.com/nicc777/py_workdocs_prep" }, "release_url": "https://pypi.org/project/py-workdocs-prep/0.5.1/", "requires_dist": null, "requires_python": ">=3.6, <4", "summary": "AWS Workdocs Preparation Utility", "version": "0.5.1", "yanked": false, "yanked_reason": null }, "last_serial": 6111674, "releases": { "0.3.1": [ { "comment_text": "", "digests": { "md5": "424b5ee26e43cf0c2a274e8ec17c4e00", "sha256": "20da3d96fc3892815aaf18c5609c7c20231f920400dc108e8cf213e87ed1470c" }, "downloads": -1, "filename": "py_workdocs_prep-0.3.1.tar.gz", "has_sig": false, "md5_digest": "424b5ee26e43cf0c2a274e8ec17c4e00", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6, <4", "size": 7689, "upload_time": "2019-10-29T07:46:31", "upload_time_iso_8601": "2019-10-29T07:46:31.095403Z", "url": "https://files.pythonhosted.org/packages/86/9e/118a2098cbeda96a4c2ec18e62c63bd97b725a2198c362d8791d15e3f6aa/py_workdocs_prep-0.3.1.tar.gz", "yanked": false, 
"yanked_reason": null } ], "0.4.0": [ { "comment_text": "", "digests": { "md5": "5dc02c86d846de8610bd1c6b9e878eff", "sha256": "7587cace657f74ceb7939f3927192fd5a046fcb83a843b63fd5c26c48edfda12" }, "downloads": -1, "filename": "py_workdocs_prep-0.4.0.tar.gz", "has_sig": false, "md5_digest": "5dc02c86d846de8610bd1c6b9e878eff", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6, <4", "size": 8436, "upload_time": "2019-11-01T04:44:25", "upload_time_iso_8601": "2019-11-01T04:44:25.911933Z", "url": "https://files.pythonhosted.org/packages/2e/cb/d1c832606a10a1c62ba006a15dad29205104444a5281ad5243f42cf0437d/py_workdocs_prep-0.4.0.tar.gz", "yanked": false, "yanked_reason": null } ], "0.5.0": [ { "comment_text": "", "digests": { "md5": "6e117de5752a6e8b191e0028eab48631", "sha256": "b74a450c43ffd48b58fd25669c06cd9d28f10a7212ed2b643593fdb1fddad195" }, "downloads": -1, "filename": "py_workdocs_prep-0.5.0.tar.gz", "has_sig": false, "md5_digest": "6e117de5752a6e8b191e0028eab48631", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6, <4", "size": 9886, "upload_time": "2019-11-10T13:14:21", "upload_time_iso_8601": "2019-11-10T13:14:21.466844Z", "url": "https://files.pythonhosted.org/packages/93/7c/be61d8ecd3fa7d83154a05535d74e41084e35f1c0adf8e61e55b66f745c4/py_workdocs_prep-0.5.0.tar.gz", "yanked": false, "yanked_reason": null } ], "0.5.1": [ { "comment_text": "", "digests": { "md5": "51f1501f5ff8ffa0cd7ec861974e03b5", "sha256": "2db4beafced5f795c4a3bc7bbf183ffba908877d0b5754afb9665d5ca0c9e365" }, "downloads": -1, "filename": "py_workdocs_prep-0.5.1.tar.gz", "has_sig": false, "md5_digest": "51f1501f5ff8ffa0cd7ec861974e03b5", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6, <4", "size": 10784, "upload_time": "2019-11-10T13:42:09", "upload_time_iso_8601": "2019-11-10T13:42:09.275472Z", "url": 
"https://files.pythonhosted.org/packages/75/91/bf424211c832b9ebe72757398630612fbee1026f7d3ed39cd3f6e13c125a/py_workdocs_prep-0.5.1.tar.gz", "yanked": false, "yanked_reason": null } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "51f1501f5ff8ffa0cd7ec861974e03b5", "sha256": "2db4beafced5f795c4a3bc7bbf183ffba908877d0b5754afb9665d5ca0c9e365" }, "downloads": -1, "filename": "py_workdocs_prep-0.5.1.tar.gz", "has_sig": false, "md5_digest": "51f1501f5ff8ffa0cd7ec861974e03b5", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.6, <4", "size": 10784, "upload_time": "2019-11-10T13:42:09", "upload_time_iso_8601": "2019-11-10T13:42:09.275472Z", "url": "https://files.pythonhosted.org/packages/75/91/bf424211c832b9ebe72757398630612fbee1026f7d3ed39cd3f6e13c125a/py_workdocs_prep-0.5.1.tar.gz", "yanked": false, "yanked_reason": null } ], "vulnerabilities": [] }