{ "info": { "author": "Dan Michael O. Hegg\u00f8", "author_email": "d.m.heggo@ub.uio.no", "bugtrack_url": null, "classifiers": [ "Programming Language :: Python", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7" ], "description": "# Almar · [![Travis](https://img.shields.io/travis/scriptotek/almar.svg)](https://travis-ci.org/scriptotek/almar)\n[![Codecov](https://img.shields.io/codecov/c/github/scriptotek/almar.svg)](https://codecov.io/gh/scriptotek/almar)\n[![Supported Python Versions](https://img.shields.io/pypi/pyversions/almar.svg)](https://pypi.python.org/pypi/almar)\n\nAlmar (formerly Lokar) is a script for batch editing and removing controlled\nclassification and subject heading fields (084/648/650/651/655) in bibliographic\nrecords in Alma using the Alma APIs. Tested with Python 2.7 and Python 3.5+.\n\nIt will use an SRU service to search for records, fetch and modify the MARCXML\nrecords and use the Alma Bibs API to write the modified records back to Alma.\n\nThe script will only work with fields having a vocabulary code defined in `$2`.\nSince the Alma SRU service does not provide search indexes for specific\nvocabularies, almar instead starts by searching using the `alma.subjects` + the\n`alma.authority_vocabulary` indices. This returns all records having a subject\nfield A with the given term and a subject field B with the given vocabulary\ncode, but where A is not necessarily equal to B, so almar filters the result\nlist to find the records where A is actually the same as B.\n\n[![asciicast](https://asciinema.org/a/Bf4E5NEoMmcMmsw2Z9Mxc1DNi.png)](https://asciinema.org/a/Bf4E5NEoMmcMmsw2Z9Mxc1DNi)\n\n## Installation and configuration\n\n1. Run `pip install -e .` to install `almar` and its dependencies.\n2. Create a configuration file. Almar will first look for `almar.yml` in the\n current directory, then for `lokar.yml` (legacy) and finally for `.almar.yml`\n in your home directory.\n\nHere's a minimal configuration file to start with:\n\n```\n---\ndefault_vocabulary: INSERT MARC VOCABULARY CODE HERE\n\nvocabularies:\n - marc_code: INSERT MARC VOCABULARY CODE HERE\n\ndefault_env: prod\n\nenv:\n - name: prod\n api_key: INSERT API KEY HERE\n api_region: eu\n sru_url: INSERT SRU URL HERE\n```\n\n1. Replace `INSERT MARC VOCABULARY CODE HERE` with the vocabulary code of\n your vocabulary (the `$2` value). The script uses this value as a filter,\n to ensure it only edits subject fields from the specified vocabulary.\n2. Replace `INSERT API KEY HERE` with the API key of your Alma instance. If\n you'r connected to a network zone, you should probably use a network zone key.\n Otherwise the edits will be stored as local edits in the institution zone.\n3. Optionally: Change api_region to 'na' (North America) or 'ap' (Asia Pacific).\n4. Replace `INSERT SRU URL HERE` with the URL to your SRU endpoint. Again: use\n the network zone endpoint if you're connected to a network zone. For Bibsys\n institutions, use `https://bibsys-k.alma.exlibrisgroup.com/view/sru/47BIBSYS_NETWORK`\n\nNote: In the file above, we've configured a single Alma environment called \"prod\".\nIt's possible to add multiple environments (for instance a sandbox and a\nproduction environment) and switch between them using the `-e` command line option.\nHere's an example:\n\n```\n---\ndefault_vocabulary: noubomn\n\nvocabularies:\n - marc_code: noubomn\n id_service: http://data.ub.uio.no/microservices/authorize.php?vocabulary=realfagstermer&term={term}&tag={tag}\n\ndefault_env: nz_prod\n\nenv:\n - name: nz_sandbox\n api_key: API KEY HERE\n api_region: eu\n sru_url: https://sandbox-eu.alma.exlibrisgroup.com/view/sru/47BIBSYS_NETWORK\n - name: nz_prod\n api_key: API KEY HERE\n api_region: eu\n sru_url: https://bibsys-k.alma.exlibrisgroup.com/view/sru/47BIBSYS_NETWORK\n```\n\nFor all configuration options, see\n[configuration options](https://github.com/scriptotek/lokar/wiki/Configuration-options).\n\n## Usage\n\nBefore using the tool, make sure you have set the vocabulary code (`vocabulary.marc_code`)\nfor the vocabulary you want to work with in the configuration file.\nThe tool will only make changes to fields having a `$2` value that matches\nthe `vocabulary.marc_code` code set in your configuration file.\n\nGetting help:\n\n* `almar -h` to show a list of command and general command line options\n* `almar replace -h` to show help for the \"replace\" subcommand\n\n### Replace a subject heading\n\nTo replace \"Term\" with \"New term\" in 650 fields:\n\n almar replace '650 Term' 'New term'\n\nor, since 650 is defined as the default field, you can also use the shorthand:\n\n almar replace 'Term' 'New term'\n\nTo work with any other field than the 650 field, the field number must be explicit:\n\n almar replace '655 Term' 'New term'`\n\nSupported fields are 084, 648, 650, 651 and 655.\n\n### Test things first with dry run\n\nTo see the changes made to each catalog record, add the `--diffs` flag. Combined\nwith the `--dry_run` flag (or `-d`), you will see the changes that would be made\nto the records without actually touching any records:\n\n almar replace --diffs --dry_run 'Term' 'New term'\n\nThis way, you can easily get a feel for how the tool works.\n\n### Move a subject to another MARC tag\n\nTo move a subject heading from 650 to 651:\n\n almar replace '650 Term' '651 Term'\n\nor you can use the shorthand\n\n almar replace '650 Term' '651'\n\nif the term itself is the same. You can also move and change a heading in\none operation:\n\n almar replace '650 Term' '651 New term'\n\n### Remove a subject heading\n\nTo remove all 650 fields having either `$a Term` or `$x Term`:\n\n almar remove '650 Term'\n\nor, since 650 is the default field, the shorthand:\n\n almar remove 'Term'\n\n### List documents\n\nIf you just want a list of documents without making any changes, use `almar list`:\n\n almar list '650 Term'\n\nOptionally with titles:\n\n almar list '650 Term' --titles\n\n\n### More complex edits\n\nFor more complex edits, such as replacing two subject headings with one,\nuse the `--rem` and `--add` options to remove and add subject headings.\nFor instance, to replace `Physics` AND `History` (655) with a single subject `History of physics`:\n\n almar --rem 'Physics' --rem '655 History' --add 'History of physics'\n\nNote that only records having *all* of the two subjects to be removed (the `--rem` subjects) will be modified.\nAny number of `--rem` and `--add` options is supported.\n\n\n### Interactive editing\n\nIf you need to split a concept into two or more concepts, you can use the\ninteractive mode.\nExample: to replace \"Kretser\" with \"Integrerte kretser\"\non some documents, but with \"Elektriske kretser\" on other, run:\n\n almar interactive 'Kretser' 'Integrerte kretser' 'Elektriske kretser'\n\nFor each record, Almar will print the title and subject headings and ask you\nwhich of the two headings to include on the record. Use the arrow keys and space\nto check one or the other, both or none of the headings, then press Enter to\nconfirm the selection and save the record.\n\n### Working with a custom document set\n\nBy default, `almar` will check all the documents returned from the following\nCQL query: `alma.subjects = \"{term}\" AND alma.authority_vocabulary = \"{vocabulary}\"`,\nbut you can use the `--cql` argument to specify a different query if you only\nwant to work with a subset of the documents. For instance,\n\n lokar --cql 'alma.all_for_ui = \"999707921404702201\"' --diffs replace 'Some subject' 'Some other subject'\n\nThe variables `{term}` and `{vocabulary}` can be used in the query string.\n\n## Notes\n\n* For terms consisting of more than one word, you must add quotation marks (single or double)\n around the term, as in the examples above. For single word terms, this is optional.\n* In search, the first letter is case insensitive. If you search for \"old term\", both\n \"old term\" and \"Old term\" will be replaced (but not \"old Term\").\n\n\n## Identifiers\n\nIdentifiers (`$0`) are added/updated if you configure a\n[ID lookup service URL](https://github.com/scriptotek/almar/wiki/Authority-ID-lookup-service)\n(`id_service`) in your configuration file. The service should accept\na GET request with the parameters `vocabulary`, `term` and `tag` and return the\nidentifier of the matched concept as a JSON object. See\n[this page](https://github.com/scriptotek/almar/wiki/Authority-ID-lookup-service)\nfor more details.\n\nFor an example service using [Skosmos](https://github.com/NatLibFi/Skosmos), see\n[code](https://github.com/scriptotek/data.ub.uio.no/blob/v2/www/default/microservices/authorize.php)\nand [demo](https://data.ub.uio.no/microservices/authorize.php?vocabulary=realfagstermer&term=Diagrambasert%20resonnering&tag=650).\n\n\n## Limited support for subject strings\n\nFour kinds of string operations are currently supported:\n\n* `almar remove 'Aaa : Bbb'` deletes occurances of `$a Aaa $x Bbb`\n* `almar replace 'Aaa : Bbb' 'Ccc : Ddd'` replaces `$a Aaa $x Bbb` with `$a Ccc $x Ddd`\n* `almar replace 'Aaa : Bbb' 'Ccc'` replaces `$a Aaa $x Bbb` with `$a Ccc` (replacing subfield `$a` and removing subfield `$x`)\n* `almar replace 'Aaa' 'Bbb : Ccc'` replaces `$a Aaa` with `$a Bbb $x $Ccc` (replacing subfield `$a` and adding subfield `$x`)\n\nNote: A term is only recognized as a string if there is space before and after colon (` : `).\n\n## More complex replacements\n\nTo make more complex replacements, we can use the advanced MARC syntax, where\neach argument is a complete MARC field using double `$`s as subfield delimiters.\n\nLet's start by listing documents having the subject \"Advanced Composition Explorer\"\nin our default vocabulary using the simple syntax:\n\n almar list 'Advanced Composition Explorer'\n\nTo get the same list using the advanced syntax, we would write:\n\n almar list '650 #7 $$a Advanced Composition Explorer $$2 noubomn'\n\nNotice that the quotation encapsulates the entire MARC field. And that we have explicitly\nspecified the vocabulary. This means we can make inter-vocabulary replacements.\nTo move the term to the \"bare\" vocabulary:\n\n almar replace '650 #7 $$a Advanced Composition Explorer $$2 noubomn' '610 27 $$a The Advanced Composition Explorer $$2 noubomn'\n\nWe also changed the Marc tag and the field indicators in the same process.\nWe could also include more subfields in the process:\n\n almar replace '650 #7 $$a Advanced Composition Explorer $$2 noubomn' '610 27 $$a The Advanced Composition Explorer $$2 noubomn $$0 (NO-TrBIB)99023187'\n\nNote that unlike simple search and replace, the order of the subfields does not matter when matching.\nExtra subfields do matter, however, except for `$0` and `$9`. To match any value (including no value)\nfor some subfield, use the value `{ANY_VALUE}`. Example:\n\n almar list --subjects '650 #7 $$a Sekvenseringsmetoder $$x {ANY_VALUE} $$2 noubomn'\n\n## Using it as a Python library\n\n```python\nfrom almar import SruClient, Alma\n\napi_region = 'eu'\napi_key = 'SECRET'\nsru_url = 'https://sandbox-eu.alma.exlibrisgroup.com/view/sru/47BIBSYS_NETWORK'\n\nsru = SruClient(sru_url)\nalma = Alma(api_region, api_key)\n\nquery = 'alma.authority_vocabulary=\"noubomn\"'\nfor record in sru.search(query):\n for subject in record.subjects(vocabulary='noubomn'):\n if not subject.find('subfield[@code=\"0\"]'):\n sa = subject.findtext('subfield[@code=\"a\"]')\n sx = subject.findtext('subfield[@code=\"x\"]')\n```\n\n## Development\n\nTo run tests:\n\n pip install -r test-requirements.txt\n py.test", "description_content_type": "text/markdown", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/scriptotek/almar", "keywords": "marc alma", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "almar", "package_url": "https://pypi.org/project/almar/", "platform": "", "project_url": "https://pypi.org/project/almar/", "project_urls": { "Homepage": "https://github.com/scriptotek/almar" }, "release_url": "https://pypi.org/project/almar/0.8.2/", "requires_dist": null, "requires_python": "", "summary": "Search and replace for subject fields in Alma records.", "version": "0.8.2" }, "last_serial": 5197856, "releases": { "0.7.6": [ { "comment_text": "", "digests": { "md5": "08d3b04677e1f379902a045749497cff", "sha256": "a3b1e6887c917897f1c580615159e99fd0a8c0d0703cad0d320d767ecab5986a" }, "downloads": -1, "filename": "almar-0.7.6.tar.gz", "has_sig": false, "md5_digest": "08d3b04677e1f379902a045749497cff", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25473, "upload_time": "2019-04-08T21:47:08", "url": "https://files.pythonhosted.org/packages/0d/27/e43c429305759114649837c76ca5de0437358b241cc4fb991ef5cfd7c03e/almar-0.7.6.tar.gz" } ], "0.8.0": [ { "comment_text": "", "digests": { "md5": "79156b6fa3cd09305c6f5d86a30a60f9", "sha256": "f4ee9e5180389f52d08dd05cb386265661fdc2a7831d57b7a3f4560b867b4706" }, "downloads": -1, "filename": "almar-0.8.0.tar.gz", "has_sig": false, "md5_digest": "79156b6fa3cd09305c6f5d86a30a60f9", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26921, "upload_time": "2019-04-09T16:04:36", "url": "https://files.pythonhosted.org/packages/e4/70/891c054739656b7f8fe6b87c794fa7b6d363894bcce87b577b4396ce66de/almar-0.8.0.tar.gz" } ], "0.8.1": [ { "comment_text": "", "digests": { "md5": "dadc2683b91bfb047d81c27362b895c1", "sha256": "67848030c3754c552ee2c82fdbcd50325d25efb7437d8497986ddccecd85f585" }, "downloads": -1, "filename": "almar-0.8.1.tar.gz", "has_sig": false, "md5_digest": "dadc2683b91bfb047d81c27362b895c1", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26925, "upload_time": "2019-04-09T18:53:05", "url": "https://files.pythonhosted.org/packages/ab/0e/539e5c7dd7314947057e24adb2e5e1e5d42ddf32c94048edf6a650b12680/almar-0.8.1.tar.gz" } ], "0.8.2": [ { "comment_text": "", "digests": { "md5": "4141a99262ebea83c8ad4a9d27907765", "sha256": "7baef67c9ad502bd016f64f6b46c00829dbcee46efa148a3d72822a21e67624b" }, "downloads": -1, "filename": "almar-0.8.2.tar.gz", "has_sig": false, "md5_digest": "4141a99262ebea83c8ad4a9d27907765", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26936, "upload_time": "2019-04-27T21:29:05", "url": "https://files.pythonhosted.org/packages/d4/8e/f0a6f5c6ff4874071242cdf1e8bd764367b4ee4854b9c41c23b40c7121f6/almar-0.8.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "4141a99262ebea83c8ad4a9d27907765", "sha256": "7baef67c9ad502bd016f64f6b46c00829dbcee46efa148a3d72822a21e67624b" }, "downloads": -1, "filename": "almar-0.8.2.tar.gz", "has_sig": false, "md5_digest": "4141a99262ebea83c8ad4a9d27907765", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 26936, "upload_time": "2019-04-27T21:29:05", "url": "https://files.pythonhosted.org/packages/d4/8e/f0a6f5c6ff4874071242cdf1e8bd764367b4ee4854b9c41c23b40c7121f6/almar-0.8.2.tar.gz" } ] }