{ "info": { "author": "", "author_email": "", "bugtrack_url": null, "classifiers": [], "description": "Scyllabackup\n============\n\n## Overview\n\n**Scyllabackup** is a backup management tool for scylladb. It allows taking\nsnapshots of scylladb and upload the resulting snapshot to cloud storage. The\nintended usage is to use this tool via cronjob which takes regular snapshots and\nuploads them to highly available cloud storage services. It also implements some\nworkflows to automate the restore of scylladb. All this is packaged in a simple\ncli wrapped around a modular codebase.\n\nIt supports uploading snapshots to cloud storage via\n[spongeblob](https://github.com/helpshift/spongeblob.py) library, thus\nsupporting AWS Simple Storage Service (*S3*) and Windows Azure Blob Storage\n(*WABS*) services. It wraps nodetool and cqlsh utility to take backup with a\nsingle call. It stores the metadata in a local sqlite file which is also\nuploaded to remote server with each upload for disaster recovery.\n\n\n## Installation\n\nYou can install scyllabackup via pip\n\n```\npip install scyllabackup\n```\n\n## Dependencies\n\nScyllabackup manages cloud storage via `spongeblob` python module which\nimplements a common interface for both S3 and WABS.\n\n## Usage\n\nScyllabackup implements its various operations as subcommands. Following\nsubcommands are available.\n\n\n### take\n\nTo take backup run:\n\n```\nscyllabackup take -c /etc/scyllabackup.yml\n```\n\nThis will create a snapshot and upload it to cloud storage as defined by\nparameters in /etc/scyllabackup.yml. It also takes a schema dump of scylladb and\nstores it in sqlite database. The schema backup is required when restoring.\n\nSnapshots in scyllabackup are referenced by unix `epoch` when the snapshot was\ntaken. All backup metadata is stored against this `epoch` value, and the `epoch`\nis required for other operations like download/restore of snapshots and deleting\nold snapshots.\n\nThis command should be run regularly via a cronjob at same time across an entire\nscylladb cluster for consistent backups.\n\n\n### list\n\nThis command lists down all available snapshots\n\n```\nscyllabackup list -c /etc/scyllabackup.yml\n```\n\nAdditionally, if you want a snapshot before a specific timestamp in unix epoch\ninteger, use the following variant. This lists the last 5 snapshots before the\nspecified timestamp:\n\n```\nscyllabackup list --latest-before -c /etc/scyllabackup.yml\n```\n\n### download\\_db\n\nWhen restoring a cluster, you will first require to download the sqlite metadata\nfile. In case of a disaster, this file will be available in cloud storage. To\nfetch this, run the following command:\n\n```\nscyllabackup download_db -c /etc/scyllabackup.yml \n```\n\nThe db file will be downloaded to `download_db_path` location. Please note that\nthe `db` parameter in /etc/scyllabackup.yml should be same as the db filename\nused when taking backups. This is because the db file uploaded in cloud storage\nhas same basename as the file present on filesystem and this is configurable.\n\n### download\n\nOnce the scyllabackup backup metadata db is downloaded, you can download a snapshot using\nfollowing command:\n\n```\nscyllabackup download -c /etc/scyllabackup.yml --snapshot --download-dir --schema \n```\n\nThis will download the snapshot referenced by `epoch` in the `download_dir`\ndirectory. The schema file is also generated in the path specified by\n`schema_file`. Please ensure that `download_dir` this is not the same directory\nas the data directory for scylladb node where you want to restore. As it may\nresult in overwrite of files which may be unintended. This command can also take\n`--keyspace` parameter if you want to download snapshotted sstable files for a\nspecific keyspace.\n\nThere is a variant of this command which takes a `--latest-before` parameter\ninstead of `--snapshot` parameter. These flags are mutually exclusive, and will\nresult in error if used together. When `--latest-before` parameter is used,\ninstead of downloading the exact snapshot, the download command will download\nthe latest snapshot before the specified timestamp. This function is useful for\nbuilding automation for restoring multiple nodes in a scylladb cluster to a\nsnapshot which was taken around same time.\n\n### verify\n\nDownloading snapshot is an expensive operations and sometimes you may find that\nsome files are missing (although unlikely). This command is implemented for a\nsanity check that all files for a snapshot are existing, there is a verify\ncommand, which verifies that all files for a snapshot in metadata db are present\non the cloud storage, without downloading them.\n\n```\nscyllabackup verify -c /etc/scyllabackup.yml \n```\n\nThis verifies that all files for snapshot referenced by `snapshot_epoch` are present.\n\n### restore\\_schema\n\nThe schema file generated via the `download` subcommand can be restored via this\none. It wraps cqlsh to read the schema file and restore it. You may also use\ncqlsh directly. This command is implemented to keep the interface uniform. Note\nthat schema should be restored before attempting to restore the data.\n\n```\nscyllabackup restore_schema -c /etc/scyllabackup.yml \n```\n\nThis will restore schema in scylladb by running DDL statements in `schema_file`.\nThis command asks for a manual confirmation, which can be skipped with `-f`\nflag.\n\n### get\\_restore\\_mapping\n\nThe table directories in scylladb have a UUID in their name which is generated\nwhen the tables are created. This UUID will be different from in a fresh\nclustered where schema is restored from the UUID of tables in a snapshot.\n\nIt is cumbersome to manually figure this out and manually restore the backup\nfiles. This is a helper command to automate restore. It generates a json file\nwhich has a mapping of table directories between the downloaded snapshot and\nschema restored fresh cluster. The scylladb service should be up to generate\nthis mapping.\n\n```\nscyllabackup get_restore_mapping -c /etc/scyllabackup.yml --restore-path --keyspace \n```\n\nWhere `download_dir` is the directory where a snapshot files are downloaded via\n`download` subcommand, `keyspace` is the keyspace you want to restore and\n`restore_mapping_file` is the file where the json mapping will be written to.\n\n### restore\n\nOnce the restore_mapping is available, you can restore the files from snapshot\nto scylladb. Ensure that scylladb is stopped before restoring the files. There\nis a basic sanity check would warn if scylladb is still up, but you should\ndouble check that. The restore can be performed using following command:\n\n```\nscyllabackup restore -c /etc/scyllabackup.yml --restore-path --restore-mapping-file \n```\n\n`download_dir` is the directory where the snapshot was downloaded via `download`\nsubcommand. `restore_mapping_file` is the file which was generated via\n`get_restore_mapping` subcommand. This will move all sstable files from download\ndir to scylladb data dir. Please note that `restore_mapping_file` has mapping\nfor only one keyspace, you may need to run this command multiple times for other\nkeyspace.\n\n### delete\\_older\\_than\n\nThis command delete snapshots older than specified number of days from cloud\nstorage. The snapshot references are also dropped from metadata db, thereby\nperforming a cleanup and keep the sqlite db size consistent.\n\n```\nscyllabackup delete_older_than -c /etc/scyllabackup.yml \n```\n\nUsually, it's a good practice to delete snapshots older than 30 days for various\ncompliance (GDPR etc.).\n\n## Common Options\n\nThere are some common options for all scyllabackup subcommands which can either\nbe passed as cli args or can be loaded from a config file for brevity. Some of\nthese options have sane default value and may not require to be passed. The\nimportant parameters are documented in this section. A documentation of options\ncan be retrieved via `-h` for a subcommand\n\n```\nusage: scyllabackup take [-h] [-c CONF_FILE]\n [-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}] --path PATH\n --db DB --provider {s3,wabs}\n [--nodetool-path NODETOOL_PATH]\n [--cqlsh-path CQLSH_PATH] [--cqlsh-host CQLSH_HOST]\n [--cqlsh-port CQLSH_PORT]\n [--s3-bucket-name BUCKET_NAME] [--s3-aws-key AWS_KEY]\n [--s3-aws-secret AWS_SECRET]\n [--wabs-container-name WABS_CONTAINER_NAME]\n [--wabs-account-name WABS_ACCOUNT_NAME]\n [--wabs-sas-token WABS_SAS_TOKEN] --prefix PREFIX\n [--lock LOCK] [--lock-timeout LOCK_TIMEOUT]\n [--max-workers MAX_WORKERS]\n\nArgs that start with '--' (eg. -l) can also be set in a config file (specified\nvia -c). Config file syntax allows: key=value, flag=true, stuff=[a,b,c] (for\ndetails, see syntax at https://goo.gl/R74nmi). If an arg is specified in more\nthan one place, then commandline values override config file values which\noverride defaults.\n\noptional arguments:\n -h, --help show this help message and exit\n -c CONF_FILE, --conf-file CONF_FILE\n Config file for scyllabackup (default: None)\n -l {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}\n Log level for scyllabackup (default: WARNING)\n --path PATH Path of scylla data directory (default: None)\n --db DB Path of scyllabackup db file. The backup metadata is\n stored in this file. (default: None)\n --provider {s3,wabs} Cloud provider used for storage. It should be one of\n `s3` or `wabs` (default: None)\n --nodetool-path NODETOOL_PATH\n Path of nodetool utility on filesystem. (default:\n /usr/bin/nodetool)\n --cqlsh-path CQLSH_PATH\n Path of cqlsh utility on filesystem (default:\n /usr/bin/cqlsh)\n --cqlsh-host CQLSH_HOST\n Host to use for connecting cqlsh service (default:\n 127.0.0.1)\n --cqlsh-port CQLSH_PORT\n Port to use for connecting cqlsh service (default:\n 9042)\n --prefix PREFIX Mandatory prefix to store backups in cloud storage\n (default: None)\n --lock LOCK Lock file for scyllabackup. (default:\n /var/run/lock/scyllabackup.lock)\n --lock-timeout LOCK_TIMEOUT\n Timeout for taking lock. (default: 10)\n --max-workers MAX_WORKERS\n Sets max workers for parallelizing storage api calls\n (default: 4)\n\nRequired arguments if using 's3' provider:\n --s3-bucket-name BUCKET_NAME\n Mandatory if provider is s3 (default: None)\n --s3-aws-key AWS_KEY Mandatory if provider is s3 (default: None)\n --s3-aws-secret AWS_SECRET\n Mandatory if provider is s3 (default: None)\n\nRequired arguments if using 'wabs' provider:\n --wabs-container-name WABS_CONTAINER_NAME\n Mandatory if provider is wabs (default: None)\n --wabs-account-name WABS_ACCOUNT_NAME\n Mandatory if provider is wabs (default: None)\n --wabs-sas-token WABS_SAS_TOKEN\n Mandatory if provider is wabs (default: None)\n```\n\n## TODOs\n- [ ] Support PITR with commitlogs (requires commitlog archiving support from scylladb)\n- [ ] Some common cli opts are not required in all subcommands, require fixing.\n- [ ] Do not rely on db for metadata.\n - [ ] Upload schema in s3 as a file. Store in db as a record.\n - [ ] Upload snapshot file list in s3 in a timestamp\n - [ ] Cleanup function should take care of these files when removing data\n\n\n", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/helpshift/scyllabackup", "keywords": "", "license": "MIT License", "maintainer": "", "maintainer_email": "", "name": "scyllabackup", "package_url": "https://pypi.org/project/scyllabackup/", "platform": "", "project_url": "https://pypi.org/project/scyllabackup/", "project_urls": { "Homepage": "https://github.com/helpshift/scyllabackup" }, "release_url": "https://pypi.org/project/scyllabackup/0.1.0/", "requires_dist": [ "spongeblob (==0.1.1)", "tenacity (==4.10.0)", "sh (==1.12.14)", "gevent (==1.2.2)", "ConfigArgParse (==0.13.0)", "filelock (==3.0.4)" ], "requires_python": "", "summary": "Scyllabackup: A tool for taking scylla backups", "version": "0.1.0" }, "last_serial": 4269012, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "ac361fcb8cb958f4b38f0f5435b36794", "sha256": "dd663de3ac7ec6cc6f9ae24934ea82530b33df5087062c80b7fce607b5976fb6" }, "downloads": -1, "filename": "scyllabackup-0.1.0-py2-none-any.whl", "has_sig": false, "md5_digest": "ac361fcb8cb958f4b38f0f5435b36794", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 19049, "upload_time": "2018-09-13T14:51:50", "url": "https://files.pythonhosted.org/packages/1c/56/3fe7848817939eee09a37ab685235e21fee8b8cf5fdf62a7dc1038a6dcb9/scyllabackup-0.1.0-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9fb72df4e6a681c141523957706d8562", "sha256": "f162248e6034f1f11fb12b087902a251b6b8a7fb05671143f63b3a70b3f5b34a" }, "downloads": -1, "filename": "scyllabackup-0.1.0.tar.gz", "has_sig": false, "md5_digest": "9fb72df4e6a681c141523957706d8562", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20594, "upload_time": "2018-09-13T14:51:52", "url": "https://files.pythonhosted.org/packages/7e/49/411796828128fe718c44d06713ab3c40c46ad4da7dda5109f1556180a039/scyllabackup-0.1.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "ac361fcb8cb958f4b38f0f5435b36794", "sha256": "dd663de3ac7ec6cc6f9ae24934ea82530b33df5087062c80b7fce607b5976fb6" }, "downloads": -1, "filename": "scyllabackup-0.1.0-py2-none-any.whl", "has_sig": false, "md5_digest": "ac361fcb8cb958f4b38f0f5435b36794", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 19049, "upload_time": "2018-09-13T14:51:50", "url": "https://files.pythonhosted.org/packages/1c/56/3fe7848817939eee09a37ab685235e21fee8b8cf5fdf62a7dc1038a6dcb9/scyllabackup-0.1.0-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "9fb72df4e6a681c141523957706d8562", "sha256": "f162248e6034f1f11fb12b087902a251b6b8a7fb05671143f63b3a70b3f5b34a" }, "downloads": -1, "filename": "scyllabackup-0.1.0.tar.gz", "has_sig": false, "md5_digest": "9fb72df4e6a681c141523957706d8562", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 20594, "upload_time": "2018-09-13T14:51:52", "url": "https://files.pythonhosted.org/packages/7e/49/411796828128fe718c44d06713ab3c40c46ad4da7dda5109f1556180a039/scyllabackup-0.1.0.tar.gz" } ] }