{ "info": { "author": "Efi Merdler-Kravitz", "author_email": "efi.merdler@gmail.com", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "## Basic Usage\n\n### Installation\n\n`pip install s3-scheduler`\n\n### Setting up a recurring flow\n\nThe library uses AWS builtin capability to run every 1 minute. The configuration depends on your framework. For example for Zappa use\n\n`s3_events.py`\n```python\n\nscheduler = Scheduler(...)\ndef check_scheduled_events():\n scheduler.handle()\n```\n\n`zappa_settings.json`\n```json\n{\n \"production\": {\n \"events\": [\n {\n \"function\": \"utils.s3_events.check_scheduled_events\",\n \"expression\": \"rate(1 minute)\"\n }\n ]\n }\n}\n```\n\n### Scheduling\n\n```\nimport boto3\nfrom s3_scheduler.scheduler import Scheduler\nfrom s3_scheduler.utils import nowut\ns3_resource = boto3.resource(\"s3\")\ns3_client = boto3.client(\"s3\")\nscheduler = Scheduler(\"bucket\", \"/path\", s3_resource, s3_client)\ntime = nowut() + timedelta(minutes=10)\nupload_to = scheduler.schedule(time, \"s3-bucket\", \"s3_files-important\", \"content\")\n```\nDuring initialization the scheduler requires the bucket and folder in which to keep the actual scheduling details. Remember, each event is a separate file, therefore there is a need to save them somewhere. When to schedule is a simple `datetime` object.\n\n### Stopping\n\n```python\nscheduler = Scheduler(\"bucket\", \"/path\", s3_resource, s3_client)\ntime = nowut() + timedelta(minutes=10)\nkey = scheduler.schedule(time, \"s3-bucket\", \"s3_files-important\", \"content\")\nscheduler.stop(key)\n```\n\nIn case you want to cancel the schedule event before it occurs\n\n# Using S3 as a scheduler\n\nS3 is a powerful tool and it can be used for more than elastic persistent layer. You can read more about it on [hackernoon.com](https://hackernoon.com/s3-the-best-of-2-worlds-92576f23c000)\n\nIn the following post I\u2019m going to demonstrate how to use S3 as a scheduling mechanism to execute various tasks.\n\n## Overview\n\n![Simple S3 flow](https://cdn-images-1.medium.com/max/2000/1*8_iclxSZ_B6M--uGXkNp0Q.png)\n\nS3 alongside a Lambda function creates a simple event base flow, e.g. attach a Lambda to S3 PUT event, create a new file and the Lambda function is called. In order to create a schedule event all you have to do is to write the file you want to act upon on the designated time, however AWS only enables you to create recurring [events using cron or rate expression](https://docs.aws.amazon.com/lambda/latest/dg/tutorial-scheduled-events-schedule-expressions.html), what happens when you want to schedule a one time event? You are stuck.\n\nThe S3-Scheduler library enables you to do just that, it uses S3 as a scheduling mechanism that enables you to schedule one time event. \n\n### How it works\n\n![](https://cdn-images-1.medium.com/max/2000/1*9ZApM13Gq9OyobtKmdlZkg.png)\n\nEach event is a separate file, behind the scenes the library uses the recurring mechanism to wake up every 1 minute, scan for the relevant files using S3\u2019s [filter capabilities](https://boto3.readthedocs.io/en/latest/guide/collections.html) and if the scheduled time had passed move the file to the relevant bucket + key.\n\nThe library, in order to function properly has to know the answer to three questions:\n\n1. The content to save.\n2. Where to save it (bucket + key) \u2192 will trigger the appropriate Lambda function.\n3. When to move it to the appropriate bucket.\n\n### Encoding details\n\nThe content to save is left unchanged, points 2 and 3 mentioned above are encoded in the key\u2019s name and use | as a separator between the parts, for example to copy the relevant content on the 5th of August to a bucket called s3-bucket and a folder named s3_important_files the scheduler will produce the following file `2018\u201308\u201305|s3-bucket|s3_files-important` . By keeping the meta data outside the actual content we achieve couple of benefits:\n\n* Speed up the process, no need to read the entire content in order to decide when and where to copy. \n* It allows the content to be binary, not only text based.\n* By using S3 filter capabilities it reduces the cost to fetch the correct files.\n* Easier debugging, just view the file name in order to understand when and where to copy.\n\n## Fin\n\nScheduling in the AWS serverless world is a bit tricky, right now AWS only provides CRON like capabilities, this post demonstrated a technique that can be used to create a more robust scheduling capability.", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/efi-mk/s3-scheduler", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "s3-scheduler", "package_url": "https://pypi.org/project/s3-scheduler/", "platform": "", "project_url": "https://pypi.org/project/s3-scheduler/", "project_urls": { "Homepage": "https://github.com/efi-mk/s3-scheduler" }, "release_url": "https://pypi.org/project/s3-scheduler/0.0.3/", "requires_dist": null, "requires_python": "", "summary": "Use S3 as a scheduler mechanism using Lambda", "version": "0.0.3" }, "last_serial": 4139953, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "3bfc7fd88750053e2c4d271e9ce397f7", "sha256": "55bb4722826b068ac2362970199d426adc5aa12d3bbce63610ca618774070bc5" }, "downloads": -1, "filename": "s3_scheduler-0.0.1.tar.gz", "has_sig": false, "md5_digest": "3bfc7fd88750053e2c4d271e9ce397f7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5748, "upload_time": "2018-08-06T10:22:42", "url": "https://files.pythonhosted.org/packages/e6/c7/4dcf8f4269ff47228966d40b717eb9750a1574de6348137ede7c75bd67bd/s3_scheduler-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "af666c4fdd4a56a15cef3ca50fce6a68", "sha256": "a85922821a22769f98f6b6457524b1c83d89dae374094df22e1fa2d5292e3898" }, "downloads": -1, "filename": "s3_scheduler-0.0.2.tar.gz", "has_sig": false, "md5_digest": "af666c4fdd4a56a15cef3ca50fce6a68", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5923, "upload_time": "2018-08-06T10:40:34", "url": "https://files.pythonhosted.org/packages/d5/78/a233e16e21e7b22a1e2626aba939504533cf7f3263688415049a67aa26f5/s3_scheduler-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "d45f26994eaebaea9cb81ecb01137fd2", "sha256": "74a1b24b90703f8893df1e003e97006a0b16b9aaa0ad1a831657cda0c199d79d" }, "downloads": -1, "filename": "s3_scheduler-0.0.3.tar.gz", "has_sig": false, "md5_digest": "d45f26994eaebaea9cb81ecb01137fd2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5832, "upload_time": "2018-08-06T11:25:04", "url": "https://files.pythonhosted.org/packages/aa/77/9b821e609c9a75bbcadf0a81a59e686db30c406656cfb7d43addfbcac9fb/s3_scheduler-0.0.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "d45f26994eaebaea9cb81ecb01137fd2", "sha256": "74a1b24b90703f8893df1e003e97006a0b16b9aaa0ad1a831657cda0c199d79d" }, "downloads": -1, "filename": "s3_scheduler-0.0.3.tar.gz", "has_sig": false, "md5_digest": "d45f26994eaebaea9cb81ecb01137fd2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 5832, "upload_time": "2018-08-06T11:25:04", "url": "https://files.pythonhosted.org/packages/aa/77/9b821e609c9a75bbcadf0a81a59e686db30c406656cfb7d43addfbcac9fb/s3_scheduler-0.0.3.tar.gz" } ] }