{ "info": { "author": "Lukas Lalinsky", "author_email": "lalinsky@gmail.com", "bugtrack_url": null, "classifiers": [ "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 2", "Programming Language :: Python :: 3" ], "description": "##########################\nMusicBrainz Database Tools\n##########################\n\n|pypi badge| |ci badge|\n\n.. |pypi badge| image:: https://badge.fury.io/py/mbdata.svg\n :target: https://badge.fury.io/py/mbdata\n\n.. |ci badge| image:: https://code.oxygene.sk/lukas/mbdata/badges/master/pipeline.svg\n :target: https://code.oxygene.sk/lukas/mbdata/commits/master\n\n****************************\nMusicBrainz Database Replica\n****************************\n\nThis repository now contains a collection of scripts for managing a\nreplica of the MusicBrainz database. These used to be called \"mbslave\",\nbut have been moved to this repository.\n\nThe main motivation for these scripts is to be able to customize\nyour database. If you don't need such custimizations, it might be\neasier to use the replication tools provided by MusicBrainz itself.\n\nInstallation\n============\n\n1. Make sure you have `Python `__ and `psycopg2 `__ installed.\n On Debian and Ubuntu, that means installing these packages::\n\n sudo apt install python python-pip python-psycopg2\n pip install -U mbdata # if you don't have it installed already\n\n The command will install the ``mbslave`` script into ``$HOME/.local/bin``.\n\n2. Get an API token on the `MetaBrainz website `__.\n\n3. Create mbslave.conf by copying and editing `mbslave.conf.default `__::\n\n curl https://raw.githubusercontent.com/lalinsky/mbdata/master/mbslave.conf.default -o mbslave.conf\n vim mbslave.conf\n\n By default, the ``mbslave`` script will look for the config file in the current directory.\n If you want it to find it from anywhere, either save it to ``/etc/mbslave.conf`` or\n set the ``MBSLAVE_CONFIG`` environment variable. For example:::\n\n export MBSLAVE_CONFIG=/usr/local/etc/mbslave.conf\n\n4. Setup the database. If you are starting completely from scratch,\n you can use the following commands to setup a clean database::\n\n sudo su - postgres\n createuser musicbrainz\n createdb -l C -E UTF-8 -T template0 -O musicbrainz musicbrainz\n createlang plpgsql musicbrainz\n psql musicbrainz -c 'CREATE EXTENSION cube;'\n psql musicbrainz -c 'CREATE EXTENSION earthdistance;'\n\n5. Prepare empty schemas for the MusicBrainz database and create the table structure::\n\n echo 'CREATE SCHEMA musicbrainz;' | mbslave psql -S\n echo 'CREATE SCHEMA statistics;' | mbslave psql -S\n echo 'CREATE SCHEMA cover_art_archive;' | mbslave psql -S\n echo 'CREATE SCHEMA wikidocs;' | mbslave psql -S\n echo 'CREATE SCHEMA documentation;' | mbslave psql -S\n\n mbslave psql -f CreateTables.sql\n mbslave psql -f statistics/CreateTables.sql\n mbslave psql -f caa/CreateTables.sql\n mbslave psql -f wikidocs/CreateTables.sql\n mbslave psql -f documentation/CreateTables.sql\n\n6. Download the MusicBrainz database dump files from\n http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/\n\n7. Import the data dumps, for example::\n\n mbslave import mbdump.tar.bz2 mbdump-derived.tar.bz2\n\n8. Setup primary keys, indexes and views::\n\n mbslave psql -f CreatePrimaryKeys.sql\n mbslave psql -f statistics/CreatePrimaryKeys.sql\n mbslave psql -f caa/CreatePrimaryKeys.sql\n mbslave psql -f wikidocs/CreatePrimaryKeys.sql\n mbslave psql -f documentation/CreatePrimaryKeys.sql\n\n mbslave psql -f CreateIndexes.sql\n mbslave psql -f CreateSlaveIndexes.sql\n mbslave psql -f statistics/CreateIndexes.sql\n mbslave psql -f caa/CreateIndexes.sql\n\n mbslave psql -f CreateFunctions.sql\n mbslave psql -f CreateViews.sql\n\n9. Vacuum the newly created database (optional)::\n\n echo 'VACUUM ANALYZE;' | mbslave psql\n\nReplication\n===========\n\nAfter the initial database setup, you might want to update the database with the latest data.\nThe `mbslave sync` script will fetch updates from MusicBrainz and apply it to your local database::\n\n mbslave sync\n\nIn order to update your database regularly, add a cron job like this that runs every hour::\n\n 15 * * * * mbslave sync >>/var/log/mbslave.log\n\nSchema Upgrade\n==============\n\nWhen the MusicBrainz database schema changes, the replication will stop working.\nThis is usually announced on the `MusicBrainz blog `__.\nWhen it happens, you need to upgrade the database.\n\nRelease 2019-05-14 (25)\n~~~~~~~~~~~~~~~~~~~~~~~\n\nRun the upgrade scripts::\n\n mbslave psql -f updates/schema-change/25.slave.sql\n echo 'UPDATE replication_control SET current_schema_sequence = 25;' | mbslave psql\n\nRelease 2017-05-25 (24)\n~~~~~~~~~~~~~~~~~~~~~~~\n\nRun the upgrade scripts::\n\n mbslave psql -f updates/schema-change/24.slave.sql\n echo 'UPDATE replication_control SET current_schema_sequence = 24;' | mbslave psql\n\nTips and Tricks\n===============\n\nSingle Database Schema\n~~~~~~~~~~~~~~~~~~~~~~\n\nMusicBrainz used a number of schemas by default. If you are embedding the MusicBrainz database into\nan existing database for your application, it's convenient to merge them all into a single schema.\nThat can be done by changing your config like this::\n\n [schemas]\n musicbrainz=musicbrainz\n statistics=musicbrainz\n cover_art_archive=musicbrainz\n wikidocs=musicbrainz\n documentation=musicbrainz\n\nAfter this, you only need to create the \"musicbrainz\" schema and import all the tables there.\n\nFull Import Schema Upgrade\n~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nYou can use the schema mapping feature to do zero-downtime upgrade of the database with full\ndata import. You can temporarily map all schemas to e.g. \"musicbrainz_NEW\", import your new\ndatabase there and then rename it::\n\n echo 'BEGIN; ALTER SCHEMA musicbrainz RENAME TO musicbrainz_OLD; ALTER SCHEMA musicbrainz_NEW RENAME TO musicbrainz; COMMIT;' | mbslave psql -S\n\n*****************\nSQLAlchemy Models\n*****************\n\nIf you are developing a Python application that needs access to the\n`MusicBrainz `__\n`data `__, you can use\nthe ``mbdata.models`` module to get\n`SQLAlchemy `__ models mapped to the\nMusicBrainz database tables.\n\nAll tables from the MusicBrainz database are mapped, all foreign keys\nhave one-way relationships set up and some models, where it's essential\nto access their related models, have two-way relationships (collections)\nset up.\n\nIn order to work with the relationships efficiently, you should use the\nappropriate kind of `eager\nloading `__.\n\nExample usage of the models:\n\n.. code:: python\n\n >>> from sqlalchemy import create_engine\n >>> from sqlalchemy.orm import sessionmaker\n >>> from mbdata.models import Artist\n >>> engine = create_engine('postgresql://musicbrainz:musicbrainz@127.0.0.1/musicbrainz', echo=True)\n >>> Session = sessionmaker(bind=engine)\n >>> session = Session()\n >>> artist = session.query(Artist).filter_by(gid='8970d868-0723-483b-a75b-51088913d3d4').first()\n >>> print artist.name\n\nIf you use the models in your own application and want to define foreign\nkeys from your own models to the MusicBrainz schema, you will need to\nlet ``mbdata`` know which metadata object to add the MusicBrainz tables\nto:\n\n.. code:: python\n\n from sqlalchemy.ext.declarative import declarative_base\n Base = declarative_base()\n\n # this should be the first place where you import anything from mbdata\n import mbdata.config\n mbdata.config.configure(base_class=Base)\n\n # now you can import and use the mbdata models\n import mbdata.models\n\nYou can also use ``mbdata.config`` to re-map the MusicBrainz schema\nnames, if your database doesn't follow the original structure:\n\n.. code:: python\n\n import mbdata.config\n mbdata.config.configure(schema='my_own_mb_schema')\n\nIf you need sample MusicBrainz data for your tests, you can use\n``mbdata.sample_data``:\n\n.. code:: python\n\n from mbdata.sample_data import create_sample_data\n create_sample_data(session)\n\n********\nHTTP API\n********\n\n**Note:** This is very much a work in progress. It is not ready to use\nyet. Any help is welcome.\n\nThere is also a HTTP API, which you can use to access the MusicBrainz\ndata using JSON or XML formats over HTTP. This is useful if you want to\nabstract away the MusicBrainz PostgreSQL database.\n\nInstallation:\n\n.. code:: sh\n\n virtualenv --system-site-packages e\n . e/bin/activate\n pip install -r requirements.txt\n python setup.py develop\n\nConfiguration:\n\n.. code:: sh\n\n cp settings.py.sample settings.py\n vim settings.py\n\nStart the development server:\n\n.. code:: sh\n\n MBDATA_API_SETTINGS=`pwd`/settings.py python -m mbdata.api.app\n\nQuery the API:\n\n.. code:: sh\n\n curl 'http://127.0.0.1:5000/v1/artist/get?id=b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d'\n\nFor production use, you should use server software like\n`uWSGI `__ and\n`nginx `__ to run the service.\n\n**********\nSolr Index\n**********\n\nCreate a minimal Solr configuration:\n\n.. code:: sh\n\n ./bin/create_solr_home.py -d /tmp/mbdata_solr\n\nStart Solr:\n\n.. code:: sh\n\n cd /path/to/solr-4.6.1/example\n java -Dsolr.solr.home=/tmp/mbdata_solr -jar start.jar\n\n***********\nDevelopment\n***********\n\nNormally you should work against a regular PostgreSQL database with\nMusicBrainz data, but for testing purposes, you can use a SQLite\ndatabase with small data sub-set used in unit tests. You can create the\ndatabase using:\n\n.. code:: sh\n\n ./bin/create_sample_db.py sample.db\n\nThen you can change your configuration:\n\n.. code:: sh\n\n DATABASE_URI = 'sqlite:///sample.db'\n\nRunning tests:\n\n.. code:: sh\n\n nosetests -v\n\nIf you want to see the SQL queries from a failed test, you can use the\nfollowing:\n\n.. code:: sh\n\n MBDATA_DATABASE_ECHO=1 nosetests -v\n\nJenkins task that automatically runs the tests after each commit is\n`here `__.", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/lalinsky/mbdata", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "mbdata", "package_url": "https://pypi.org/project/mbdata/", "platform": "ALL", "project_url": "https://pypi.org/project/mbdata/", "project_urls": { "Homepage": "https://github.com/lalinsky/mbdata" }, "release_url": "https://pypi.org/project/mbdata/25.0.4/", "requires_dist": null, "requires_python": "", "summary": "MusicBrainz Database Tools", "version": "25.0.4" }, "last_serial": 5277994, "releases": { "25.0.0": [ { "comment_text": "", "digests": { "md5": "195e2c374160b6a39f235ae33bfa3d1c", "sha256": "a16f026155f501f40a75c7d3ebc78f53824b481c3de735dd1fdad0ce8039bb46" }, "downloads": -1, "filename": "mbdata-25.0.0.tar.gz", "has_sig": false, "md5_digest": "195e2c374160b6a39f235ae33bfa3d1c", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 350003, "upload_time": "2019-05-15T12:29:38", "url": "https://files.pythonhosted.org/packages/b9/f7/a192909a0d2fd7b073c40a20896abdc461a96899bf80cfec09b656bf7965/mbdata-25.0.0.tar.gz" } ], "25.0.1": [ { "comment_text": "", "digests": { "md5": "32287b2620ab5a6761dc5d5acfbf3680", "sha256": "97a73f6b2e007863ab762317d529fe64266e538bca45b167bcd8a87970316bbd" }, "downloads": -1, "filename": "mbdata-25.0.1.tar.gz", "has_sig": false, "md5_digest": "32287b2620ab5a6761dc5d5acfbf3680", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 351801, "upload_time": "2019-05-16T08:45:51", "url": "https://files.pythonhosted.org/packages/75/e8/664c375562aefdbd75b51934f8d4a9ad35c6e3da24f7be9b5cba6b9f9171/mbdata-25.0.1.tar.gz" } ], "25.0.2": [ { "comment_text": "", "digests": { "md5": "1231c1730434f3462a776148723227c2", "sha256": "ae3c50842d379227a0a60e89f324d6030fda707988415da4ee110ac66b038e3a" }, "downloads": -1, "filename": "mbdata-25.0.2.tar.gz", "has_sig": false, "md5_digest": "1231c1730434f3462a776148723227c2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 351797, "upload_time": "2019-05-16T09:06:51", "url": "https://files.pythonhosted.org/packages/c8/82/877d566a42ac9cb8f25eb6e7dba3cc7f5f9b37edd8272348a9adf90a76ed/mbdata-25.0.2.tar.gz" } ], "25.0.3": [ { "comment_text": "", "digests": { "md5": "e519b3930b2e88d2540c4981b642e7b4", "sha256": "8b4e9925879759b5564a79a1734104676a9fd3a409b6113d4571d90a5b0df280" }, "downloads": -1, "filename": "mbdata-25.0.3.tar.gz", "has_sig": false, "md5_digest": "e519b3930b2e88d2540c4981b642e7b4", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 351883, "upload_time": "2019-05-16T09:29:58", "url": "https://files.pythonhosted.org/packages/01/10/dc20bf65a75bfbbf1744009ca4f0e12dbe025974fc59e9058adf5c9eeeb0/mbdata-25.0.3.tar.gz" } ], "25.0.4": [ { "comment_text": "", "digests": { "md5": "a8c40637974fd26585543d058be072c7", "sha256": "1aab8797552e7e77881389b1aad4fe587458fb7e343d14410f2a662e9eaea081" }, "downloads": -1, "filename": "mbdata-25.0.4.tar.gz", "has_sig": false, "md5_digest": "a8c40637974fd26585543d058be072c7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 351888, "upload_time": "2019-05-16T15:31:17", "url": "https://files.pythonhosted.org/packages/65/79/6c2462897fbae5d382cb91a302ab5e0a6d91a385842a69f8bafd5645b7e6/mbdata-25.0.4.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "a8c40637974fd26585543d058be072c7", "sha256": "1aab8797552e7e77881389b1aad4fe587458fb7e343d14410f2a662e9eaea081" }, "downloads": -1, "filename": "mbdata-25.0.4.tar.gz", "has_sig": false, "md5_digest": "a8c40637974fd26585543d058be072c7", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 351888, "upload_time": "2019-05-16T15:31:17", "url": "https://files.pythonhosted.org/packages/65/79/6c2462897fbae5d382cb91a302ab5e0a6d91a385842a69f8bafd5645b7e6/mbdata-25.0.4.tar.gz" } ] }