{ "info": { "author": "Russell Power", "author_email": "power@cs.nyu.edu", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "Intended Audience :: System Administrators", "License :: OSI Approved :: BSD License", "Operating System :: POSIX", "Programming Language :: Python :: 2.5", "Programming Language :: Python :: 2.6", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.0", "Programming Language :: Python :: 3.1", "Programming Language :: Python :: 3.2", "Topic :: Software Development :: Libraries", "Topic :: System :: Clustering", "Topic :: System :: Distributed Computing" ], "description": "MyCloud\n=======\n\nLeverage small clusters of machines to increase your productivity.\n\nMyCloud requires no prior setup; if you can SSH to your machines, then\nit will work out of the box. MyCloud currently exports a simple\nmapreduce API with several common input formats; adding support for your\nown is easy as well.\n\nUsage\n=====\n\nStarting your cluster:\n\n::\n\n import mycloud\n\n cluster = mycloud.Cluster(['machine1', 'machine2'])\n\n # or use defaults from ~/.config/mycloud\n # cluster = mycloud.Cluster()\n\nMap over a list:\n\n::\n\n result = cluster.map(compute_factors, range(1000))\n\nClientFS makes accessing local files seamless!\n\n::\n\n def my_worker(filename):\n do_work(mycloud.fs.FS.open(filename, 'r'))\n\n cluster.map(['client:///my/local/file'], my_worker)\n\nUse the MapReduce interface to easily handle processing of larger\ndatasets:\n\n::\n\n from mycloud.mapreduce import MapReduce, group\n from mycloud.resource import CSV \n input_desc = [CSV('client:///path/to/my_input_%d.csv') % i for i in range(100)]\n output_desc = [CSV('client:///path/to/my_output_file.csv')]\n\n def map_identity(kv_iter, output):\n for k, v in kv_iter:\n output(k, int(v[0]))\n\n def reduce_sum(kv_iter, output): \n for k, values in group(kv_iter):\n output(k, sum(values))\n\n mr = MapReduce(cluster, map_identity, reduce_sum, input_desc, output_desc)\n\n result = mr.run()\n\n for k, v in result[0].reader():\n print k, v\n\nPerformance\n===========\n\nIt is, keep in mind, written entirely in Python.\n\nSome simple operations I've used it for (6 machines, 96 cores):\n\n- Sorting a billion numbers: ~5m\n- Preprocessing 1.3 million images (resizing and SIFT feature\n extraction): ~1 hour\n\nInput formats\n=============\n\nMycloud has builtin support for processing the following file types:\n\n- LevelDB\n- CSV\n- Text (lines)\n- Zip\n\nAdding support for your own is simple - just write a resource class\ndescribing how to get a reader and writer. (see resource.py for\ndetails).\n\nWhy?!?\n======\n\nSometimes you're developing something in Python (because that's what you\ndo), and you decide you'd like it to be parallelized. Our current\noptions are multiprocessing (limiting us to a single machine) and Hadoop\nstreaming (limiting us to strings and Hadoop's input formats).\n\nAlso, because I could.\n\nCredits\n=======\n\nMyCloud builds on the phenomonally useful\n`cloud `_ serialization,\n`SSH/Paramiko `_, and\n`LevelDB `_ libraries.", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "http://github.com/rjpower/mycloud", "keywords": null, "license": "BSD", "maintainer": null, "maintainer_email": null, "name": "mycloud", "package_url": "https://pypi.org/project/mycloud/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/mycloud/", "project_urls": { "Download": "UNKNOWN", "Homepage": "http://github.com/rjpower/mycloud" }, "release_url": "https://pypi.org/project/mycloud/0.51/", "requires_dist": null, "requires_python": null, "summary": "Work distribution for small clusters.", "version": "0.51" }, "last_serial": 795140, "releases": { "0.48": [ { "comment_text": "", "digests": { "md5": "8f08de28d4589083ed444c4791e29766", "sha256": "95769a952cc764b4dbfc6e29f4457255bcb4f3657c82664a37e11ed7d0c51b74" }, "downloads": -1, "filename": "mycloud-0.48.tar.gz", "has_sig": false, "md5_digest": "8f08de28d4589083ed444c4791e29766", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11965, "upload_time": "2012-11-16T22:10:52", "url": "https://files.pythonhosted.org/packages/f4/dc/dba714db7a221524f8607d343711add6fae2718a85d43aaf5e4aab745b08/mycloud-0.48.tar.gz" } ], "0.49": [ { "comment_text": "", "digests": { "md5": "a9a274c6662d4e9b46ebf11b423d3da2", "sha256": "11e76d45df445da82fd7d035e400765ce92f776cb6a43500f0384c86e75e93ce" }, "downloads": -1, "filename": "mycloud-0.49.tar.gz", "has_sig": false, "md5_digest": "a9a274c6662d4e9b46ebf11b423d3da2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 11864, "upload_time": "2013-01-01T22:45:45", "url": "https://files.pythonhosted.org/packages/0f/31/ee241621babf75d33414855b9ea469b43febe1da1c4996d1a96985b1463c/mycloud-0.49.tar.gz" } ], "0.51": [ { "comment_text": "", "digests": { "md5": "16cdbe7c0c8c03c32e4a6f1ad0a93870", "sha256": "397326a15bef6b4f80d689939c44018eb0444807ee9b95cf46b48a01100d5f25" }, "downloads": -1, "filename": "mycloud-0.51.tar.gz", "has_sig": false, "md5_digest": "16cdbe7c0c8c03c32e4a6f1ad0a93870", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12009, "upload_time": "2013-01-15T22:33:25", "url": "https://files.pythonhosted.org/packages/9b/74/101e9410700f5c252e0c21ec28ae9fa65fabdf2dd8722f1cf0acc508a0a5/mycloud-0.51.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "16cdbe7c0c8c03c32e4a6f1ad0a93870", "sha256": "397326a15bef6b4f80d689939c44018eb0444807ee9b95cf46b48a01100d5f25" }, "downloads": -1, "filename": "mycloud-0.51.tar.gz", "has_sig": false, "md5_digest": "16cdbe7c0c8c03c32e4a6f1ad0a93870", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 12009, "upload_time": "2013-01-15T22:33:25", "url": "https://files.pythonhosted.org/packages/9b/74/101e9410700f5c252e0c21ec28ae9fa65fabdf2dd8722f1cf0acc508a0a5/mycloud-0.51.tar.gz" } ] }