{ "info": { "author": "ID SIS \u2022 ETH Z\u00fcrich", "author_email": "swen@ethz.ch", "bugtrack_url": null, "classifiers": [ "License :: OSI Approved :: Apache Software License", "Operating System :: OS Independent", "Programming Language :: Python :: 3" ], "description": "# oBIS\noBIS is a command-line tool to handle data sets that are too big to store in openBIS but still need to be registered and tracked in openBIS.\n\n## Prerequisites\n* Python 3.6\n* git\n* git-annex [Installation guide](https://git-annex.branchable.com/install/)\n\n\n## Installation\n\n```\npip3 install obis\n```\n\nSince `obis` is based on `pybis`, the pip command will also install `pybis` and all its dependencies.\n\n## Usage\n\n### Help is your friend!\n\n```\n$ obis --help\nUsage: obis [OPTIONS] COMMAND [ARGS]...\n\nOptions:\n --version Show the version and exit.\n -q, --quiet Suppress status reporting.\n -s, --skip_verification Do not verify cerficiates\n -d, --debug Show stack trace on error.\n --help Show this message and exit.\n\nCommands:\n addref Add the given repository as a reference to openBIS.\n clone Clone the repository found in the given data set id.\n collection Get/set settings related to the collection.\n commit Commit the repository to git and inform openBIS.\n config Get/set configurations.\n data_set Get/set settings related to the data set.\n download Download files of a linked data set.\n init Initialize the folder as a data repository.\n init_analysis Initialize the folder as an analysis folder.\n move Move the repository found in the given data set id.\n object Get/set settings related to the object.\n removeref Remove the reference to the given repository from openBIS.\n repository Get/set settings related to the repository.\n settings Get all settings.\n status Show the state of the obis repository.\n sync Sync the repository with openBIS.\n```\n\nTo show detailed help for a specific command, type `obis [command] --help`:\n\n```\n$ obis commit --help\nUsage: obis commit 
[OPTIONS] [REPOSITORY]\n\nOptions:\n -m, --msg TEXT A message explaining what was done.\n -a, --auto_add Automatically add all untracked files.\n -i, --ignore_missing_parent If parent data set is missing, ignore it.\n --help Show this message and exit.\n```\n\n\n## Settings\nWith `get` you retrieve one or more settings. If the `key` is omitted, you retrieve all settings of the `type`:\n\n```\nobis [type] [options] get [key]\n```\n\nWith `set` you set one or more settings:\n\n```\nobis [type] [options] set [key1]=[value1], [key2]=[value2], ...\n```\n\nWith `clear` you unset one or more settings:\n\n```\nobis [type] [options] clear [key1]\n```\n\nWith the type `settings` you can get all settings at once:\n\n```\nobis settings [options] get\n```\n\nThe option `-g` can be used to interact with the global settings. The global settings are stored in `~/.obis` and are copied to an obis repository when the repository is created.\n\nThe following settings exist:\n\n| type | setting | description |\n| ---- | ------- | ----------- |\n| collection | `id` | Identifier of the collection the created data set is attached to. Use either this or the object id. |\n| config | `allow_only_https` | Default is true. If false, http can be used to connect to openBIS. |\n| config | `fileservice_url` | URL for downloading files. See DownloadHandler / FileInfoHandler services. |\n| config | `git_annex_backend` | Git annex backend to be used to calculate file hashes. Supported backends are SHA256E (default), MD5 and WORM. |\n| config | `git_annex_hash_as_checksum` | Default is true. If false, a CRC32 checksum will be calculated for openBIS. 
Otherwise, the hash calculated by git-annex will be used. |\n| config | `hostname` | Hostname to be used when cloning / moving a data set to connect to the machine where the original copy is located. |\n| config | `openbis_url` | URL for connecting to openBIS (only protocol://host:port, without a path). |\n| config | `obis_metadata_folder` | Absolute path to the folder which obis will use to store its metadata. If not set, the metadata will be stored in the same location as the data. This setting can be useful when dealing with read-only access to the data. The clone and move commands will not work when this is set. |\n| config | `user` | User for connecting to openBIS. |\n| data_set | `type` | Data set type of data sets created by obis. |\n| data_set | `properties` | Data set properties of data sets created by obis. |\n| object | `id` | Identifier of the object the created data set is attached to. Use either this or the collection id. |\n| repository | `data_set_id` | This is set by obis. It is the id of the most recent data set created by obis and will be used as the parent of the next one. |\n| repository | `external_dms_id` | This is set by obis. Id of the external dms in openBIS. |\n| repository | `id` | This is set by obis. Id of the obis repository. |\n\nThe settings are saved within the obis repository, in the `.obis` folder, as JSON files, or in `~/.obis` for the global settings. They can be added/edited manually, which might be useful when it comes to integration with other tools.\n\n**Example `.obis/config.json`**\n\n```\n{\n \"fileservice_url\": null,\n \"git_annex_hash_as_checksum\": true,\n \"hostname\": \"bsse-bs-dock-5-160.ethz.ch\",\n \"openbis_url\": \"http://localhost:8888\"\n}\n```\n\n**Example `.obis/data_set.json`**\n\n```\n{\n \"properties\": {\n \"K1\": \"v1\",\n \"K2\": \"v2\"\n },\n \"type\": \"UNKNOWN\"\n}\n```\n\n## Commands\n\n**init**\n\n```\nobis init [folder]\n```\n\nIf a folder is given, obis will initialize that folder as an obis repository. 
If not, it will use the current folder.\n\n**init_analysis**\n\n```\nobis init_analysis [options] [folder]\n```\n\nWith `init_analysis`, a repository can be created which is derived from a parent repository. If it is called from within a repository, that repository will be used as the parent. If not, the parent has to be given with the `-p` option.\n\n**commit**\n\n```\nobis commit [options]\n```\n\nThe `commit` command adds files to a new data set in openBIS. If the `-m` option is not used to define a commit message, the user will be asked to provide one.\n\n**sync**\n\n```\nobis sync\n```\n\nWhen git commits have been made manually, the `sync` command creates the corresponding data set in openBIS. Note that, when interacting with git directly, you should use the git annex commands whenever applicable, e.g. use `git annex add` instead of `git add`.\n\n**status**\n\n```\nobis status [folder]\n```\n\nThis shows the status of the repository folder from which it is invoked, or the one given as a parameter. It shows file changes and whether the repository needs to be synchronized with openBIS.\n\n**clone**\n\n```\nobis clone [options] [data_set_id]\n```\n\nThe `clone` command copies a repository associated with a data set and registers the new copy in openBIS. If there are already multiple copies of the repository, obis will ask from which copy to clone.\n\n* To avoid user interaction, the copy index can be chosen with the option `-c`.\n* With the option `-u` a user can be defined for copying the files from a remote system.\n* By default, the file integrity is checked by calculating the checksum. 
This can be skipped with `-s`.\n\n*Note*: This command does not work when `obis_metadata_folder` is set.\n\n\n**move**\n\n```\nobis move [options] [data_set_id]\n```\n\nThe `move` command works the same as `clone`, except that the old repository will be removed.\n\n*Note*: This command does not work when `obis_metadata_folder` is set.\n\n**download**\n\n```\nobis download [options] [data_set_id]\n```\n\nThe `download` command downloads the files of a data set. Unlike `clone`, this will not register another copy in openBIS. It is only for accessing files. This command requires the DownloadHandler / FileInfoHandler microservices to be running and `fileservice_url` to be configured.\n\n**addref / removeref**\n\n```\nobis addref\nobis removeref\n```\n\nObis repository folders can be added to or removed from openBIS. This can be useful when a repository was moved or copied without using the `move` or `clone` commands.\n\n## Examples\n\n**Create an obis repository and commit to openBIS**\n\n```\n# global settings to be used for all obis repositories\nobis config -g set openbis_url=https://localhost:8888\nobis config -g set user=admin\n# create an obis repository with a file\nobis init data1\ncd data1\necho content >> example_file\n# configure the repository\nobis data_set set type=UNKNOWN\nobis object set id=/DEFAULT/DEFAULT\n# commit to openBIS\nobis commit -m 'message'\n```\n\n**Commit to git and sync manually**\n\n```\n# assuming we are in a configured obis repository\necho content >> example_file\ngit annex add example_file\ngit commit -m 'message'\nobis sync\n```\n\n**Create an analysis repository**\n\n```\n# assuming we have a repository 'data1'\nobis init_analysis -p data1 analysis1\ncd analysis1\nobis data_set set type=UNKNOWN\nobis object set id=/DEFAULT/DEFAULT\necho content >> example_file\nobis commit -m 'message'\n```\n\n## Big Data Link Services\nThe Big Data Link Services can be used to download files which are contained in an obis repository. 
The services are included in the installation folder of openBIS, under `servers/big_data_link_services`. To configure and run them, consult the [README.md](https://sissource.ethz.ch/sispub/openbis/blob/master/big_data_link_server/README.md) file.\n\n## Rationale for obis\n\nData-provenance tracking tools like openBIS make it possible to understand and follow the research process. What was studied, what data was acquired and how, and how the data was analyzed to arrive at final results for publication -- this is information that is captured in openBIS. In the standard usage scenario, openBIS stores and manages data directly. This has the advantage that openBIS acts as a gatekeeper to the data, making it easy to keep backups or enforce access restrictions, etc. However, this way of working is not a good solution for all situations.\n\nSome research groups work with large amounts of data (e.g., multiple TB), which makes it inefficient and impractical to give openBIS control of the data. Other research groups require that data be stored on a shared file system under a well-defined directory structure, be it for historical reasons or because of the tools they use. In this case as well, it is difficult to give openBIS full control of the data.\n\nFor situations like these, we have developed `obis`, a tool for orderly management of data in conditions that require great flexibility. `obis` makes it possible to track data on a file system, where users have complete freedom to structure and manipulate the data as they wish, while retaining the benefits of openBIS. With `obis`, only metadata is actually stored and managed by openBIS. The data itself is managed externally, by the user, but openBIS is aware of its existence and the data can be used for provenance tracking. 
`obis` is packaged as a stand-alone utility; to make it available, it only needs to be added to the `PATH` variable in a UNIX or UNIX-like environment.\n\nUnder the covers, `obis` takes advantage of publicly available and tested tools to manage data on the file system. In particular, it uses `git` and `git-annex` to track the content of a data set. Using `git-annex`, even large binary artifacts can be tracked efficiently. For communication with openBIS, `obis` uses the openBIS API, which offers the power to register and track all metadata supported by openBIS.\n\n\n## Literature\n\nV. Korolev, A. Joshi, M.A. Grasso, et al., \"PROB: A tool for tracking provenance and reproducibility of big data experiments\", Reproduce '14. HPCA 2014, vol. 11, pp. 264-286, 2014.\nhttp://ebiquity.umbc.edu/_file_directory_/papers/693.pdf", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://sissource.ethz.ch/sispub/openbis/tree/master/obis", "keywords": "", "license": "Apache Software License Version 2.0", "maintainer": "", "maintainer_email": "", "name": "obis", "package_url": "https://pypi.org/project/obis/", "platform": "", "project_url": "https://pypi.org/project/obis/", "project_urls": { "Homepage": "https://sissource.ethz.ch/sispub/openbis/tree/master/obis" }, "release_url": "https://pypi.org/project/obis/0.2.1/", "requires_dist": null, "requires_python": ">=3.3", "summary": "Local data management with assistance from OpenBIS.", "version": "0.2.1" }, "last_serial": 5439858, "releases": { "0.2.0.dev1": [ { "comment_text": "", "digests": { "md5": "f9defe7a39bb7b976dbbee1291a02d2c", "sha256": "efd23884057cd8deef79ae0d8632ab0c9b6b13a6faf4de1ec4a5f4aeaab204ae" }, "downloads": -1, "filename": "obis-0.2.0.dev1.tar.gz", "has_sig": false, "md5_digest": "f9defe7a39bb7b976dbbee1291a02d2c", "packagetype": "sdist", "python_version": 
"source", "requires_python": ">=3.3", "size": 244276, "upload_time": "2019-06-24T09:29:26", "url": "https://files.pythonhosted.org/packages/61/28/eb1c1812360d6af1cd5990a7b1f09e773ee9f70deb2c1fe81471112e8965/obis-0.2.0.dev1.tar.gz" } ], "0.2.1": [ { "comment_text": "", "digests": { "md5": "b1339a61e3ce117f84f0311d5cee87ad", "sha256": "5680dee984802e714baab7f8af4d78c8b9c0953f035308556947d3dad67a7af1" }, "downloads": -1, "filename": "obis-0.2.1.tar.gz", "has_sig": false, "md5_digest": "b1339a61e3ce117f84f0311d5cee87ad", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.3", "size": 246742, "upload_time": "2019-06-24T09:37:54", "url": "https://files.pythonhosted.org/packages/ad/18/dbd2da7db26986362f73e790eff52e0e8690c484e4ffe2b91b41b62f50e9/obis-0.2.1.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "b1339a61e3ce117f84f0311d5cee87ad", "sha256": "5680dee984802e714baab7f8af4d78c8b9c0953f035308556947d3dad67a7af1" }, "downloads": -1, "filename": "obis-0.2.1.tar.gz", "has_sig": false, "md5_digest": "b1339a61e3ce117f84f0311d5cee87ad", "packagetype": "sdist", "python_version": "source", "requires_python": ">=3.3", "size": 246742, "upload_time": "2019-06-24T09:37:54", "url": "https://files.pythonhosted.org/packages/ad/18/dbd2da7db26986362f73e790eff52e0e8690c484e4ffe2b91b41b62f50e9/obis-0.2.1.tar.gz" } ] }