{
"info": {
"author": "Devon Terrell",
"author_email": "devon@axioms.me",
"bugtrack_url": null,
"classifiers": [],
"description": "Fuzzy Information Retrieval for Musical Scores (Firms)\n======================================================\n\nFirms is an IR system designed for performing fuzzy searches against a\ncorpus of musical scores. Users provide snippets of a musical work using\na common digital music representation (e.g. MusicXML) and Firms compares\nit against a corpus of pre-indexed musical scores to efficiently rank\nand return results. The retrieval process is \"fuzzy\" because Firms is\ndesigned to be resilient against several common sources of transcription\nerror, from simple typos, to aural rythmic ambiguity and key\ntransposition.\n\nSetup\n-----\n\nFirms is hosted on the Python Package Index (Pypi) and can be installed\nalong with all required dependencies with the ``pip`` command:\n\n ``pip install firms``\n\nUsage\n-----\n\nFirms can be used as either a library (by importing modules from the\n``firms`` namespace) or through a command line interface (CLI). The CLI\nis defined in ``firms_cli.py`` and can be run interactively to explore\navailable commands and options:\n\n ``firms``\n\nThis will display a list of available commands. To see more detail about\na particular command or group of commands, append ``--help`` to the\ncommand. For example, to see more detail about initializing a Firms\nindex, run\n\n ``firms create --help``\n\nor for different options for adding pieces to the index:\n\n ``firms add --help``\n\nMany commands have subcommands supporting related operations. For\ninstance, to add all MusicXML pieces from a directory, use the\n``add dir`` subcommand:\n\n ``firms add dir /path/to/directory``\n\nAt a broad level, the CLI offers the following features:\n\n#. Create a Firms index\n#. Add pieces to the index by specifying a composer, a path to a valid\n MusicXML file, or by adding all pieces in the music21 corpus\n#. Show general information about the stored data\n#. Query for a piece by providing an example via MusicXML or in\n tinynotation\n#. Show a piece as MusicXML\n#. Output the midi version of a piece\n#. Evaluate the system by randomly selecting sections from the music21\n corpus and probabilistically introducing various types of errors\n\nExamples\n--------\n\nThe ``./examples`` directory contains a number of example music scores\nthat can be added to the system for demo purposes. Each of these pieces\nis in the public domain and are available at\n`OpenMusicScore `__. To add these pieces to\nthe index, run\n\n ``firms add dir \"examples\"``\n\nEach piece can be queried using the CLI method:\n\n ``firms query tiny ``\n\nReplace ```` in the command above with the query\ncorresponding to the piece in the table below. This query format is a\nsimple ASCII-based notation for representing small snippets of music\nusing standard western notation.\n\n More information on tinynotation can be found `in the music21\n documentation `__.\n\n============================== =====================================================================================================================\nPiece Query \n============================== =====================================================================================================================\nAmazing Grace tinynotation: g8 c'2 e'8 c' e'2 d'4 c'2 a4 g2 g4 c'2 e'8 c' e'2 d'4 g'2.~ g'2 e'4 g'4.~ e'8 g' e' c'2 g4 a4. c'8 c' a \nEntertainer tinynotation: d''16 e'' c'' a' a' b' g'8 d'16 e' c' a a b g8 d16 e c A A B A A- G8 \nMarch of the Wooden Soldiers tinynotation: d'8 r d' r b8. a#16 b8 r16 c'#16 d'8 r d' r b8. a#16 b8 r16 c'#16 \nOde to Joy tinynotation: b b c' d' d' c' b a g g a b b4. a8 a2\nDeck the Halls tinynotation: d'4. c'8 b4 a g a b g a8 b c' a b4. a8 g4 f# g2\n============================== =====================================================================================================================\n\nFor example, to run the query for *Entertainer* from the table above,\nrun:\n\n ``firms query tiny \"tinynotation: d''16 e'' c'' a' a' b' g'8 d'16 e' c' a a b g8 d16 e c A A B A A- G8\"``\n\nIn addition, an XML sample of \"Ode to Joy\" is provided in the\n``examples`` directory, and can be used like so:\n\n ``firms query xml examples/ode-to-joy.query.xml``\n\nExamples with Errors\n~~~~~~~~~~~~~~~~~~~~\n\nThe table below contains versions of the sample queries above with a\ndifferent type of musical error included.\n\n============================== =============================================================================== ===========\nPiece Query Error Type\n============================== =============================================================================== ===========\nAmazing Grace tinynotation: g8 c'2 e'8 c' e'2 d'4 c'2 a4 g2 g4 g4 c'2 e'8 c' e'2 d'4 g'2. g'2 Extra Note \nEntertainer tinynotation: d''16 e'' c'' a' a' b' g' d' e' c' a a b g8 d16 e c A A B A A- G8\" Wrong Note \nMarch of the Wooden Soldiers tinynotation: d'8 r d' b8. a#16 b8 r16 c'#16 d'8 r d' r b8. a#16 b8 r16 c'#16 Missing Note \nOde to Joy tinynotation: d' d' e'- f' f' e'- d' c' b- b- c' d' d'4. c'8 c'2 Transposed \nDeck the Halls tinynotation: d'2. c'4 b2 a g a b g a4 b c' a b2. a4 g2 f# g1 Stretched Rhythm \n============================== =============================================================================== ===========\n\nFIRMs is designed to be accomodate some level of error in the user\ninput.\n\nRepeated sections\n~~~~~~~~~~~~~~~~~\n\nFailing to trace through repeated sections of music causes issues for\nsongs with a heavily repetative structure. The performance of Firms is\nhighly dependent on how these types of songs are notated. Explicitly\nwriting out repeated sections in a flat format greatly improves the\nperformance. This can be seen in the *Amazing Grace* query in the\n\"Examples\" section above. This example contains the main theme of the\nsong, but the BM25 method fails to score it highly because the repeated\nsections are ignored from the original score. The \"Amazing Grace with\nDrums Explicit Repeat\" example is an alternate engraving of the \"Amazing\nGrace with Drums\" score with repeated sections written out linearly, as\nthey would be heard by an audience. This example scores *higher* than\nthe original version because the repeats are effectively captured.\n\nFirms can automatically expand repeated sections during the indexing\nprocess. The various ``add`` commands take a boolean flag to enable the\nconversion:\n\n ``firms add dir ./examples --explicit_repeats True``\n\nNote that this process can slow down ingest time significantly. If a\npiece does not contain any repeated sections, or the repeated sections\nare malformed in some way, the following error message will be shown,\nand the ingestion process will continue with the original unexpanded\nversion:\n\n Unable to expand piece. Continuing with original\n\nEvaluation\n----------\n\nIn addition to the examples shown above, the FIRMs CLI includes a\ncommand for performing random probabilistic evaluation by sampling the\npieces included in the index.\n\nTo run an evaluation, first add some pieces to the corpus. Note, this\ncommand may take some time (~5 minutes on my laptop) as it adds over 400\npieces to the index.\n\n ``firms add composer bach --filetype xml``\n\nThen run an evaluation, specifying the number of samples to take. Note,\nthis may take some time to complete (~15 minutes for my laptop). Try\n``--n 10`` for a faster result (~1.5 min).\n\n ``firms exaluate --n 100 --noprint True``\n\nThis will take 100 samples of various lengths from the pieces available\nin the FIRMs index, perform a search based on the sample, and collect\nstatistics on the average rank of the correct result. The\n``--noprint True`` option skips printing the individual query results\ntables, while still printing the aggregate true-positive ranking\nstatistic. For exmaple, the results on my run were as follows:\n\n::\n\n Statistics for BM25\n nobs: 100\n minmax: (0, 7)\n mean: 0.19\n variance: 0.882727272727\n skewness: 6.297878668097064\n kurtosis: 40.1179051948864\n Statistics for LogWeightedSumGrader\n nobs: 100\n minmax: (0, 26)\n mean: 0.42\n variance: 6.85212121212\n skewness: 9.47574350344239\n kurtosis: 90.04614836416702\n\nThis shows statistics on the ranks of true-positive results, broken down\nby the grading methods used. The field ``nobs`` represents the total\nnumber of observations. The ``minmax`` field shows the minimum and\nmaximum ranks for true-positive results. The ``mean``, ``variance``,\n``skewness``, and ``kurtosis`` fields are statistics calculated based on\nthe the ranks.\n\nWhile this is interesting, it is not overly representative of realistic\nqueries. First, these snippets are randomly selected, whereas users are\nmore likely to enter memorable melodic lines and themes. Second, users\nare likely to include errors in their entries, either due to\ntranscription or due to ambiguities in how a particular piece is\nnotated.\n\nTackling the first issue is beyond the scope of this project, but the\n``evaluate`` method includes a number of parameters for\nprobabilistically introducing errors into the sample queries.\n\n ``firms evaluate --n 100 --erate .2``\n\nThe ``--erate .2`` parameter gives each snippet a 20% chance of\nincluding an error. The type of error chosen is controlled by the\nparameters ``--transposition_error``, ``--replace_note_error``,\n``--remove_note_error``, and ``--add_note_error``. These are decimal\nvalues between [0, 1) and should add up to 1, thus representing a\nprobability distribution. By default, they are each set to ``.25`` to\npresent an equal probability.\n\nOften we're more concerned with whether the true-positive result is\nwithin the top K results returned, such as the first page of a search\nengine. To quantify this, we can configure the evaluation scorer to\ntreat all results below K as a 0, while maintaining the rank of results\nbeyond that.\n\n ``firms evaluate --n 100 --topk 10``\n\nThis allows the system to be a little more flexible defining what it\nconsiders to be a correct result.\n\nArchitecture\n------------\n\nAt a fundamental level, Firms operates primarily on the concept of\n*stemming*. Each piece is broken into a number of small sections called\n*snippets*. These snippets are passed through several stemmers, each of\nwhich produces one or more *stems* capturing a particular dimension of\nthe snippet. For example, a stem may capture the pitches, rhythms, or\ncontour of notes within snippet. These stems are persisted in an index\nfor efficient lookup.\n\nWhen a user enters a query, the query is passed through the same\nprocess, first breaking it up into snippets, then passing each snippet\nthrough the same stemmers. The resulting stems are looked up in the\npre-constructed index, returning a list of locations within each piece\nthat match the given snippet. From there, the results may be aggregated\nusing one of several scoring mechanisms.\n\nThe implementation provides two built-in scoring mechanisms. The first\nis a log weighted sum of counts. The matching stems for each stemmer\ntype are aggregated by taking the natural logarithm of the count, then\nmultiplied by a per-stemmer weight value, and finally summed together to\nform the final grade. The second is a standard Okapi BM25 implementation\nwithout document length normalization. Two potential measures of\ndocument length which may improve the accuracy of this method are the\ntotal number of measures for a piece or the total number of snippets in\na piece. A measure based approach ignores the density of notes within a\nmeasure: a measure with a single whole note would be weighted the same\nas a measure with a melodic line of sixteenth notes. The snippet count\napproach would disproportionally impact pieces with many parts or\nvoices, as each part and voice acts as a multiplier for the number of\nsnippets contained in a piece. For these reasons, document length\nnormalization was not inlcuded for this project.\n\nThis particular implementation uses a local SQLite database to store the\npre-computed snippets, stems, and other information as a flat-file\nrelational structure. Each stemmer type is an instance of the abstract\n``FirmIndex`` class, which hides the details of the storage mechanism\nused. This allows the underlying SQLite implementation to be swapped out\nfor a more efficient storage mechanism without impacting the rest of the\nsystem.\n\nScaling and Improvements\n~~~~~~~~~~~~~~~~~~~~~~~~\n\nOne interesting side-effect of the chosen architecture is that the\napplciation may be trivially scaled by hosting multiple instances behind\na load balancer. On insert, an arbitrary instance could be chosen to\nstore the piece. On query, a scatter-gather approach could pass the\nquery to each instance, and the final results streamed back to the load\nbalancer for aggregation. This approach would enable parallel persistent\nstorage IO on each instance. With some further modification, each\ninstance could be configured to locally aggregate results before passing\nthem on for final aggregation.\n\nThere are many musical aspects not captured by the current\nimplementation, including:\n\n- Unpitched notes, e.g. percussion\n- Tied notes\n- Non-traditional western music notation\n\nDevelopment\n-----------\n\nConfigure .pypirc file with:\n\n::\n\n [distutils]\n index-servers =\n pypi\n pypitest\n\n [pypi]\n username=user-name\n password=user-password\n\n [pypitest]\n username=user-name\n password=user-password\n\nThen to create a new version:\n\n#. ``git commit``\n#. Update setup.py ``version`` and ``download_url``\n#. ``git tag `` and ``git push --tags``\n#. ``python setup.py sdist upload -r pypitest``\n#. ``pip install --upgrade firms --no-cache-dir``\n",
"description_content_type": null,
"docs_url": null,
"download_url": "https://github.com/axiomabsolute/cs410-information-retrieval/archive/0.0.11.tar.gz",
"downloads": {
"last_day": -1,
"last_month": -1,
"last_week": -1
},
"home_page": "https://github.com/axiomabsolute/cs410-information-retrieval",
"keywords": "",
"license": "MIT",
"maintainer": "",
"maintainer_email": "",
"name": "firms",
"package_url": "https://pypi.org/project/firms/",
"platform": "",
"project_url": "https://pypi.org/project/firms/",
"project_urls": {
"Download": "https://github.com/axiomabsolute/cs410-information-retrieval/archive/0.0.11.tar.gz",
"Homepage": "https://github.com/axiomabsolute/cs410-information-retrieval"
},
"release_url": "https://pypi.org/project/firms/0.0.11/",
"requires_dist": null,
"requires_python": "",
"summary": "Fuzzy Information Retrieval for Musical Scores",
"version": "0.0.11"
},
"last_serial": 3430038,
"releases": {
"0.0.1": [
{
"comment_text": "",
"digests": {
"md5": "4e4065a955f6b10672d66a74cd629f79",
"sha256": "85e2f34095ffdbd5afb952f6066268f47146293c91fdfddafbf2239ffaa3eff4"
},
"downloads": -1,
"filename": "firms-0.0.1.tar.gz",
"has_sig": false,
"md5_digest": "4e4065a955f6b10672d66a74cd629f79",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 14985,
"upload_time": "2017-12-19T19:54:36",
"url": "https://files.pythonhosted.org/packages/44/d3/8bef0319ef4313010d52de221fe8553632b2ebfee32259d4a86aed789b19/firms-0.0.1.tar.gz"
}
],
"0.0.10": [
{
"comment_text": "",
"digests": {
"md5": "6d964ecf3937a7566de6e123a5114c64",
"sha256": "09ac351ff17e1a4d967b3566d169101e64dd9e26acaaec249f796be65da16957"
},
"downloads": -1,
"filename": "firms-0.0.10.tar.gz",
"has_sig": false,
"md5_digest": "6d964ecf3937a7566de6e123a5114c64",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 22929,
"upload_time": "2017-12-20T01:41:08",
"url": "https://files.pythonhosted.org/packages/20/6a/1c6e6aa9157f9b82843ca45e3378dee6e77dba037291375b4723ac5dc9fe/firms-0.0.10.tar.gz"
}
],
"0.0.11": [
{
"comment_text": "",
"digests": {
"md5": "b87aa6fe26e703fb7b3ec66dc2dfa5bd",
"sha256": "3dab740b3392de099ccc3214ea3c6cc5691f80ca2e45b505765ff5b26a6dccc6"
},
"downloads": -1,
"filename": "firms-0.0.11.tar.gz",
"has_sig": false,
"md5_digest": "b87aa6fe26e703fb7b3ec66dc2dfa5bd",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 22935,
"upload_time": "2017-12-20T02:03:56",
"url": "https://files.pythonhosted.org/packages/b3/e3/81b817d54a1058a907b2d1aa0c6b910acd2e6b818beddec5773d48b5052e/firms-0.0.11.tar.gz"
}
],
"0.0.2": [
{
"comment_text": "",
"digests": {
"md5": "99697c8ff0e9a54e06bd224e43509497",
"sha256": "e87e0f1420da3894dd3484006ffabe4b3cc70e2be4b3c3523d13a73d3d5811e6"
},
"downloads": -1,
"filename": "firms-0.0.2.tar.gz",
"has_sig": false,
"md5_digest": "99697c8ff0e9a54e06bd224e43509497",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 20606,
"upload_time": "2017-12-19T20:10:53",
"url": "https://files.pythonhosted.org/packages/ec/15/0031adf9e2c28e9d52a973c18aef7104b2f42881806eea2a81ca21dad4fe/firms-0.0.2.tar.gz"
}
],
"0.0.3": [
{
"comment_text": "",
"digests": {
"md5": "d7e607ffe38920b788fd733fa830afb9",
"sha256": "3398c9d55eb527f8ce46aee807b1b9dec262afb4aa0b9cdc2e83c07da994fe27"
},
"downloads": -1,
"filename": "firms-0.0.3.tar.gz",
"has_sig": false,
"md5_digest": "d7e607ffe38920b788fd733fa830afb9",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 20601,
"upload_time": "2017-12-19T20:16:27",
"url": "https://files.pythonhosted.org/packages/63/1d/62ad794f5574c2c12b3332abddffd779f0582e493abcc823ea82209718b0/firms-0.0.3.tar.gz"
}
],
"0.0.4": [
{
"comment_text": "",
"digests": {
"md5": "3efe0e6cd9d9e1e2a3dcf6a73fd20dd3",
"sha256": "fb8dbc1ad3d24ba137685378e63737d58e497e96cb87b1c05fc50aff50e6555c"
},
"downloads": -1,
"filename": "firms-0.0.4.tar.gz",
"has_sig": false,
"md5_digest": "3efe0e6cd9d9e1e2a3dcf6a73fd20dd3",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 20602,
"upload_time": "2017-12-19T20:25:47",
"url": "https://files.pythonhosted.org/packages/34/a4/1529cc1c8d593a7a305c3c6842a56da6f9a3c07235bfe9683fedb0efbf17/firms-0.0.4.tar.gz"
}
],
"0.0.5": [
{
"comment_text": "",
"digests": {
"md5": "dd65ce276a29af637b8f9b07264fbc74",
"sha256": "2cd3bddc8dfa5167247c0ab662d13b655c4190a47230097ab610dd3ddf1ef8d4"
},
"downloads": -1,
"filename": "firms-0.0.5.tar.gz",
"has_sig": false,
"md5_digest": "dd65ce276a29af637b8f9b07264fbc74",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 20582,
"upload_time": "2017-12-19T20:27:49",
"url": "https://files.pythonhosted.org/packages/7f/ba/5f6614e879da0cb4ea73906c80ea0129fae28abc115b3b17878413434ec2/firms-0.0.5.tar.gz"
}
],
"0.0.6": [
{
"comment_text": "",
"digests": {
"md5": "780c28fc82bb3799587c006bf95560c6",
"sha256": "fbe33ebbea55cda82c6dcb96bf278d9ac1219b21d20bcf57085f168efc74b6b3"
},
"downloads": -1,
"filename": "firms-0.0.6.tar.gz",
"has_sig": false,
"md5_digest": "780c28fc82bb3799587c006bf95560c6",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 20583,
"upload_time": "2017-12-19T20:30:20",
"url": "https://files.pythonhosted.org/packages/be/99/e98e19a4e25b0aa2dfd37b910c1931f8b9ee6ccbd02f226c51753cfc308a/firms-0.0.6.tar.gz"
}
],
"0.0.7": [
{
"comment_text": "",
"digests": {
"md5": "00e67114bbe2b50cc16d2c3acb176124",
"sha256": "2aebf594b7eb0abd6d584dc57d61a6ba1dead1fe25916946c79e48b94c5b0706"
},
"downloads": -1,
"filename": "firms-0.0.7.tar.gz",
"has_sig": false,
"md5_digest": "00e67114bbe2b50cc16d2c3acb176124",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 20782,
"upload_time": "2017-12-19T20:41:14",
"url": "https://files.pythonhosted.org/packages/82/b9/063f63c8453fa78c69ce2bf01758dd7169940f4b886d40bd01c70ba51c10/firms-0.0.7.tar.gz"
}
],
"0.0.8": [
{
"comment_text": "",
"digests": {
"md5": "93bf8c3b5685a32aaba21fc1995a1bd6",
"sha256": "8e505b04ec5957972ca8c2ce33791842ab4789fd966e966886e150727ff95008"
},
"downloads": -1,
"filename": "firms-0.0.8.tar.gz",
"has_sig": false,
"md5_digest": "93bf8c3b5685a32aaba21fc1995a1bd6",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 23265,
"upload_time": "2017-12-20T01:09:26",
"url": "https://files.pythonhosted.org/packages/1a/d2/d49a40b445f781495b3550004d072a0c8e025ed8c2a9eb53d6f0e11b076e/firms-0.0.8.tar.gz"
}
],
"0.0.9": [
{
"comment_text": "",
"digests": {
"md5": "de7edf6cef8946ffecb1c440501c3891",
"sha256": "6563880bb3d4fc90b5575e0cb3ea357676d73dde1f767b3ee9531da90093e618"
},
"downloads": -1,
"filename": "firms-0.0.9.tar.gz",
"has_sig": false,
"md5_digest": "de7edf6cef8946ffecb1c440501c3891",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 23843,
"upload_time": "2017-12-20T01:17:42",
"url": "https://files.pythonhosted.org/packages/ee/19/b2ad03535693218dcb5a267106dedba30b0e67244bbc2688c1a4a01e866d/firms-0.0.9.tar.gz"
}
]
},
"urls": [
{
"comment_text": "",
"digests": {
"md5": "b87aa6fe26e703fb7b3ec66dc2dfa5bd",
"sha256": "3dab740b3392de099ccc3214ea3c6cc5691f80ca2e45b505765ff5b26a6dccc6"
},
"downloads": -1,
"filename": "firms-0.0.11.tar.gz",
"has_sig": false,
"md5_digest": "b87aa6fe26e703fb7b3ec66dc2dfa5bd",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 22935,
"upload_time": "2017-12-20T02:03:56",
"url": "https://files.pythonhosted.org/packages/b3/e3/81b817d54a1058a907b2d1aa0c6b910acd2e6b818beddec5773d48b5052e/firms-0.0.11.tar.gz"
}
]
}