{ "info": { "author": "Victor Alejandro Gil Sepulveda", "author_email": "victor.gil.sepulveda@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "pyScheduler\n===========\n\npyScheduler is very simple pure Python implementation of a task\nscheduler.\n\nWhat is a scheduler?\n--------------------\n\nA task scheduler is a controller that executes tasks in a given order.\nIn this case tasks are executed so that all tasks are executed before\nany of its dependencies.\n\nWhy may I use a scheduler?\n--------------------------\n\nOk, at this point you might be thinking what's the point on building\nsuch a contraption if you only want to execute some tasks. Just put them\non list, sort them so that no task is executed before its dependencies,\nand go ahead! Well, this is indeed the idea behind a serial scheduler.\nIt does not look very useful indeed... However things get more exciting\nonce you realize that you can get some profit from the inherent\nparallelism of your dependency description. This means that you are not\nlimited to execute one task at a time, but that you can execute all\ntasks without dependencies at the same time.\n\nGoing parallel\n--------------\n\npyScheduler module offers 3 different scheduler types: - SerialScheduler\n- Which is (obviously) not parallel. - ProcessParallelScheduler - Which\nuses multiprocessing module to run tasks in parallel. -\nMPIParallelScheduler - Which uses the great\n`mpi4py `_ module to run the tasks in\nparallel.\n\nUse ProcessParallelScheduler if you plan to use the parallelism of your\nlocal machine. MPIParallelScheduler is better if you want to use your\nscheduler in a cluster. When using an MPI scheduler, remember that you\nmust run your Python script using *mpirun* (-np number\\_of\\_processes)\nor it will not work (Process-based scheduler allows the specification of\nthe number of processes in the constructor).\n\nVery simple example (from the test package)\n-------------------------------------------\n\nImagine we want to execute 10 tasks which dependency graph is\nrepresented in the drawing below:\n\nFirst step would be to create a scheduler and define the tasks:\n``python scheduler = SerialScheduler() scheduler.add_task(task_name = \"1\", dependencies = [\"2\",\"3\"], description =\"\",target_function = test_function ,function_kwargs={\"this\":1}) scheduler.add_task(task_name = \"2\", dependencies = [\"4\"], description =\"\",target_function = test_function ,function_kwargs={\"this\":2}) scheduler.add_task(task_name = \"3\", dependencies = [\"5\",\"6\"], description =\"\",target_function = test_function ,function_kwargs={\"this\":3}) scheduler.add_task(task_name = \"4\", dependencies = [], description =\"\",target_function = test_function ,function_kwargs={\"this\":4}) scheduler.add_task(task_name = \"5\", dependencies = [], description =\"\",target_function = test_function ,function_kwargs={\"this\":5}) scheduler.add_task(task_name = \"6\", dependencies = [], description =\"\",target_function = test_function ,function_kwargs={\"this\":6}) scheduler.add_task(task_name = \"7\", dependencies = [], description =\"\",target_function = test_function ,function_kwargs={\"this\":7}) scheduler.add_task(task_name = \"8\", dependencies = [\"9\"], description =\"\",target_function = test_function ,function_kwargs={\"this\":8}) scheduler.add_task(task_name = \"9\", dependencies = [], description =\"\",target_function = test_function ,function_kwargs={\"this\":9})``\nThe code above shows how to create a SerialScheduler (the other two are\ncreated more or less in the same way) and adding the tasks. Each task is\ndefined by its identifier (which has to be unique), a description (that\ncan be empty) and a callable and the arguments we want to pass to this\ncallable when is executed. Finally, the method also accepts a\ndependencies array. This array contains the ids of the taks the task we\nare defining depends on.\n\nThe last step would be just to run the scheduler:\n``python scheduler.run()`` Taks will run in this order (7,9 and 8 could\nbe run before the others though):\n``4 -> 2 -> 5 -> 6 -> 3 -> 1 -> 7 -> 9 -> 8`` If you want to parallelize\nthe execution of your tasks, just change the first line by something\nlike: ``parallel = ProcessParallelScheduler(4)`` And your computer will\nuse 4 processes to complete the job (as one is used as a control node,\nonly 3 processes will contribute to the work). In this case this is one\nof the possible run traces you can get:\n``4 -> 2 -> 8 -> 9 5 -> 3 6 -> 7`` Or even:\n``4 -> 2 -> 7 5 -> 3 -> 9 6 -> 8`` which is better balanced.\n\nWhere is the documentation?\n---------------------------\n\nModules are profusely documented. Give the code a chance!\n\nNext moves\n----------\n\nThings yet to do include a mechanism to allow tasks to work with the\nresults of their direct or indirect dependencies and some sort of\nchecker to warn about (and avoid) the definition of dependency cycles.\nAdding a little more logic to ( and refactoring) the 'next task\nselector' to improve load balance (see the example above for an example\nof how selection of the tasks can affect balance). So: - [ ] Cyclic\ndependency checks - [ ] Task result sharing machinery - [ ] Smarter load\nbalancing", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/victor-gil-sepulveda/pyScheduler.git", "keywords": "task scheduling", "license": "LICENSE.txt", "maintainer": "", "maintainer_email": "", "name": "pyScheduler", "package_url": "https://pypi.org/project/pyScheduler/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/pyScheduler/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://github.com/victor-gil-sepulveda/pyScheduler.git" }, "release_url": "https://pypi.org/project/pyScheduler/0.1.0/", "requires_dist": null, "requires_python": null, "summary": "Compendium of 3 different task schedulers with dependency managing.", "version": "0.1.0" }, "last_serial": 933950, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "94501e9f60105a5b79341e9c22406666", "sha256": "358a3b20ae28e16350414ae7cbb7ecf8155e42c2754a05c96ddd93d45f01cdb6" }, "downloads": -1, "filename": "pyScheduler-0.1.0.tar.gz", "has_sig": false, "md5_digest": "94501e9f60105a5b79341e9c22406666", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9921, "upload_time": "2013-12-02T16:08:46", "url": "https://files.pythonhosted.org/packages/ac/a1/b0d105423c1258e112d32c6dbe70d65ca555f6c2f2804ac1e94fa1174afe/pyScheduler-0.1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "94501e9f60105a5b79341e9c22406666", "sha256": "358a3b20ae28e16350414ae7cbb7ecf8155e42c2754a05c96ddd93d45f01cdb6" }, "downloads": -1, "filename": "pyScheduler-0.1.0.tar.gz", "has_sig": false, "md5_digest": "94501e9f60105a5b79341e9c22406666", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9921, "upload_time": "2013-12-02T16:08:46", "url": "https://files.pythonhosted.org/packages/ac/a1/b0d105423c1258e112d32c6dbe70d65ca555f6c2f2804ac1e94fa1174afe/pyScheduler-0.1.0.tar.gz" } ] }