{ "info": { "author": "Justin Winokur", "author_email": "Jwink3101@gmail.com", "bugtrack_url": null, "classifiers": [], "description": "list\\_dict\\_DB\n==============\n\nA simple *in memory* database-like object. Replace a list of dictionaries\nwith a fast, O(1)-lookup data structure. Also supports multiple items\nvia a list.\n\nIt is very simple and designed for small to medium databases which can\nexist purely in memory. It is intended for situations where query speed is\nprioritized over memory usage and instantiation time.\n\nI built it because I was querying a list of dictionaries by multiple\nkeys *inside* of a loop. That caused an O(N^2) complexity and made my\nmoderately-sized 30,000 item list intractable!\n\nThere are other solutions out there that are more traditionally database\nfocused such as `TinyDB `__,\n`buzhug `__, etc, but this is purely\nin-memory and very fast. It is likely at the cost of extra memory\noverhead.\n\nIt passes all tests (with **100% test coverage**) on Python 2.7 and\nPython 3.6.\n\nI am not a database expert. This simply met my needs.\n\nInstall\n-------\n\nSimply\n\n::\n\n pip install list_dict_DB\n\nUsage:\n------\n\nFor fuller example usage, including the flexible query methods, see\nthe tests.\n\nConsider the following: (please do not argue accuracy. It is an example)\n\n::\n\n items = [\n {'first':'John', 'last':'Lennon','born':1940,'role':'guitar'},\n {'first':'Paul', 'last':'McCartney','born':1942,'role':'bass'},\n {'first':'George','last':'Harrison','born':1943,'role':'guitar'},\n {'first':'Ringo','last':'Starr','born':1940,'role':'drums'},\n {'first':'George','last':'Martin','born':1926,'role':'producer'}\n ]\n\nIf we want to find all members of The Beatles whose name is \"George\nHarrison\", we could do the following:\n\n::\n\n [item for item in items if item['first']=='George' and item['last']=='Harrison']\n\nThis is an O(N) operation. 
If we are only doing it once, it is fine,\nbut if we are doing it multiple times (especially in loops) it can cause\na major bottleneck.\n\nInstead do:\n\n::\n\n from list_dict_DB import list_dict_DB\n DB = list_dict_DB(items) # Will index them all\n\n DB.query(first='George',last='Harrison')\n\nThe creation is O(N) but the query is O(1) and can be done many times.\n\nQueries\n-------\n\nThere are a few different methods to perform queries. It is designed to\nbe flexible and allow for easy construction.\n\nBasic Queries\n~~~~~~~~~~~~~\n\nBasic queries only test equality with an ``and`` boolean relationship.\n\nFor example, to query the example DB for band members with the\nfirst name 'George', you can do any of the following:\n\n::\n\n DB.query(first='George')\n DB.query({'first':'George'})\n DB[{'first':'George'}] # item indices can be queries or a number\n DB(first='George') # Directly calling the object is a query()\n\nTo get George Harrison, you can do the following:\n\n::\n\n DB.query(first='George',last='Harrison')\n\nOr again, you can use a dictionary or mix and match. For example:\n\n::\n\n DB.query({'first':'George'},last='Harrison')\n\nAgain, you are restricted to equality and AND relationships.\n\nAdvanced Queries\n~~~~~~~~~~~~~~~~\n\nAdvanced queries are a bit more complex. They require a ``Qobj``. Note, a\n``Qobj`` expires if the DB index changes (``update()``, ``remove()``,\n``add()``, ``add_attribute()``, and ``reindex()``).\n\nAn advanced query is constructed as follows. **NOTE**: In Python, ``&``\nand ``|`` bind more tightly than ``==`` and ``!=``. Use parentheses to\nseparate the comparisons!\n\nFor example, to query all elements with the first name George and the\nlast name **not** Martin, you can do:\n\n::\n\n Q = DB.Qobj() # Instantiate it with the DB. DB.Q() will also work\n DB.query( (Q.first=='George') & (Q.last != 'Martin') )\n\nOr\n\n::\n\n DB.query( (DB.Q().first=='George') & (DB.Q().last != 'Martin') )\n\nNotice:\n\n- Use of parentheses. 
The queries must be separated\n- We are checking equality so ``==`` and ``!=`` are used\n\n - You can also negate with ``~`` but again, be careful and\n deliberate about parentheses\n\n- We instantiate the ``Q`` object with the DB. If the DB index is\n changed, the ``Q`` object will not be allowed to run as a precaution.\n- We used ``&`` for ``and`` and ``|`` for ``or``\n- ``<``, ``<=``, ``>``, ``>=``, and filters are supported but these are\n O(N) operations.\n\nYou can also do more advanced boolean logic such as:\n\n::\n\n DB.query( ~( (Q.role=='guitar') | (Q.role=='drums')))\n\nFilters\n^^^^^^^\n\nFilters allow for more advanced queries of the data but, as noted\nbelow, are O(N) (as with ``<``, ``<=``, ``>``, ``>=``).\n\nFor example, to perform a simple equality, the following return the same\nentry. But do note that the equality version is *much faster*.\n\nEdge Case: If an attribute's name is 'filter', the filter method may be\naccessed through ``_filter``.\n\n::\n\n # Traditional lookup:\n DB.query(Q.first == 'George') # equality is O(1)\n\n # Filter lookup\n filt = lambda item: item['first'] == 'George'\n DB.query(Q.filter(filt))\n\nFilters are flexible enough for more advanced queries.\n\nWARNING about speed\n^^^^^^^^^^^^^^^^^^^\n\nSome of the major speed gains in this are due to the use of dictionaries\nand sets which are O(1) complexity.\n\nQueries with ``<``, ``<=``, ``>``, ``>=``, and ``filters`` are O(N)\noperations and should be avoided if possible.\n\nThe time complexity of a query will depend on the number of items that\nmatch any part of the query.\n\nLoading and Saving (Dumping)\n----------------------------\n\nThere is *intentionally* no built in way to dump these as they are\nintended to be *in-memory*. 
Of course, a good way to save or load is\nas follows:\n\nDump:\n\n::\n\n import json\n with open('DB.json','w') as F:\n json.dump(DB.items(),F)\n\nLoad:\n\n::\n\n from list_dict_DB import list_dict_DB\n import json\n with open('DB.json') as F:\n DB = list_dict_DB(json.load(F))\n\nLists:\n------\n\nAll attributes must be hashable. The only exception is lists, in which\ncase the list is expanded for each item. For example, an entry may be:\n\n::\n\n {'first':'George','last':'Harrison','born':1943,'role':['guitar','sitar']}\n\nand\n\n::\n\n DB.query(role='sitar')\n\nwill return him.\n\nBenchmarks & Complexity Testing\n-------------------------------\n\nI compared creating and querying a large database with the following\nmethods. Note that some cache results so I recreated and re-queried from\nscratch. In practice, even caching the results does not help much if the\nqueries change.\n\n- ``list_dict_DB``\n- simple looping with a *copied* list (*not* ``deepcopy`` though)\n- `Pandas `__ dataframe (0.16.2)\n- `TinyDB `__ (3.2.2) with\n in-memory storage\n- `dataset `__ (0.6.0) with\n sqlite3 in-memory storage\n\n - dataset is a wrapper to\n `SQLAlchemy `__ that (in my words)\n provides a noSQL interface to SQL.\n\nI tested on my MacBook Pro (Retina, 15-inch, Mid 2014) laptop with 2.8\nGHz i7 and 16 GB of RAM using Python 2.7.9.\n\nThe following figure is the time to build and query the resulting data\nobject. Note that for TinyDB, the object was deleted between tests since\nit caches queries.\n\n|benchmarks|\n\nFrom the slope of the plots, you can estimate the complexity. I just\ncalculated from the final point. 
The order is O(N^{slope})\n\n+--------------------+---------------+----------------+\n| Tool | Query slope | Create slope |\n+====================+===============+================+\n| ``list_dict_DB`` | 0.12 | 1.01 |\n+--------------------+---------------+----------------+\n| ``loop_copy`` | 1.12 | 1.27 |\n+--------------------+---------------+----------------+\n| ``pandas`` | 0.92 | 0.99 |\n+--------------------+---------------+----------------+\n| ``TinyDB_mem`` | 1.04 | 1.00 |\n+--------------------+---------------+----------------+\n| ``dataset_mem`` | 0.03 | 1.02 |\n+--------------------+---------------+----------------+\n\n`dataset `__ gives this tool\na run for its money but it also has a lot more dependencies and was the\nslowest in creation time (though, if you use it with a file, once it is\ncreated, you do not have to recreate it again). Pandas also performs\nwell and only starts to have the O(N) dependency creep in at larger\nsizes. Of course, this is a scaling analysis. When you look at actual\nquery times, ``list_dict_DB`` is orders of magnitude faster!\n\nWhich tool is the best will be problem dependent, but these results make\na strong argument for ``list_dict_DB``\n\nKnown Issues\n------------\n\nNone at the moment.\n\nThere is 100% (!!!) test coverage. Of course that doesn't mean there\naren't bugs. If you find any, please report them.\n\nLimitations\n-----------\n\n- The entire DB exists in memory\n- Serializing (dumping) is not included though is easy to do with JSON\n or the like. See above\n- The index used in the dictionary is itself a dictionary with keys as\n any value. Since these are all done as pointers to original list, the\n memory footprint should be small.\n- This has **not** been tested for thread-safety!\n\n.. 
|benchmarks| image:: benchmark.png\n :target: benchmark.png\n\n\n", "description_content_type": null, "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/Jwink3101/list_dict_db", "keywords": "", "license": "", "maintainer": "", "maintainer_email": "", "name": "list-dict-DB", "package_url": "https://pypi.org/project/list-dict-DB/", "platform": "", "project_url": "https://pypi.org/project/list-dict-DB/", "project_urls": { "Homepage": "https://github.com/Jwink3101/list_dict_db" }, "release_url": "https://pypi.org/project/list-dict-DB/20170911.3/", "requires_dist": null, "requires_python": "", "summary": "in memory database like object with O(1) queries", "version": "20170911.3" }, "last_serial": 3166118, "releases": { "20170909": [ { "comment_text": "", "digests": { "md5": "a7d6ce002f5005c1fc0a74184c01c596", "sha256": "9e1c56179cfffc4601a5494b014a84af355a1b8e8102f80e6fa90db4ab383cd4" }, "downloads": -1, "filename": "list_dict_DB-20170909-py2.py3-none-any.whl", "has_sig": false, "md5_digest": "a7d6ce002f5005c1fc0a74184c01c596", "packagetype": "bdist_wheel", "python_version": "py2.py3", "requires_python": null, "size": 15579, "upload_time": "2017-09-09T18:48:10", "url": "https://files.pythonhosted.org/packages/d0/ce/b9908ae2c9a28b58df8a344db41528bee03a329c3f66aadbbffc9f6bba79/list_dict_DB-20170909-py2.py3-none-any.whl" } ], "20170911": [ { "comment_text": "", "digests": { "md5": "1e347f547c0e1cd8d3ca92962b1f8b04", "sha256": "ae249e6edeca3a5e128119b697a601d0d81e22ac2320fc1f0cbe110aea1b7342" }, "downloads": -1, "filename": "list_dict_DB-20170911-py2-none-any.whl", "has_sig": false, "md5_digest": "1e347f547c0e1cd8d3ca92962b1f8b04", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 15650, "upload_time": "2017-09-11T13:50:31", "url": 
"https://files.pythonhosted.org/packages/02/74/c948e905d0ffed2e265ac9121988789c209e2ed7a5746adb12ffc1bed772/list_dict_DB-20170911-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "ca05c4257a617cf1e15d764ba23b7364", "sha256": "88b66cc46231b7ca360fc851647268743b98c1604edead8454a552b658347b62" }, "downloads": -1, "filename": "list_dict_DB-20170911.tar.gz", "has_sig": false, "md5_digest": "ca05c4257a617cf1e15d764ba23b7364", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 14168, "upload_time": "2017-09-11T13:50:32", "url": "https://files.pythonhosted.org/packages/5a/68/3c9393342304e5b7e047a513e3f1519ce3a2d0e6fc4576e81474cfad4673/list_dict_DB-20170911.tar.gz" } ], "20170911.2": [ { "comment_text": "", "digests": { "md5": "a92f687c19ac47d9d4c32de69aa0fea7", "sha256": "b8a597c6e6f3eab83693711b9aafda4b2370e3369836edd6bf960673aede7e16" }, "downloads": -1, "filename": "list_dict_DB-20170911.2-py2-none-any.whl", "has_sig": false, "md5_digest": "a92f687c19ac47d9d4c32de69aa0fea7", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 15971, "upload_time": "2017-09-11T19:37:53", "url": "https://files.pythonhosted.org/packages/77/d0/a57a9c37e22dae0d8a43a431889589b48d07569e8bb66fc397a312c31357/list_dict_DB-20170911.2-py2-none-any.whl" } ], "20170911.3": [ { "comment_text": "", "digests": { "md5": "aea1f126410ddfa3b7dd073042539f3a", "sha256": "9321eeff79107d3a67676547444d5be06a796f7f06618396c5a132908a57c579" }, "downloads": -1, "filename": "list_dict_DB-20170911.3-py2-none-any.whl", "has_sig": false, "md5_digest": "aea1f126410ddfa3b7dd073042539f3a", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 15971, "upload_time": "2017-09-11T19:43:15", "url": "https://files.pythonhosted.org/packages/dd/b2/ca3391a2a37fda525a53acd3fc2931c115e7fb54e2ac293f8a280795d894/list_dict_DB-20170911.3-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": 
"d2f9ed62a481b61c829f7976d7ad7c55", "sha256": "f079057cec793f8fd3b60e9bee130b2ac59e12a013ba44133c6c23a23b3a4df9" }, "downloads": -1, "filename": "list_dict_DB-20170911.3.tar.gz", "has_sig": false, "md5_digest": "d2f9ed62a481b61c829f7976d7ad7c55", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 15284, "upload_time": "2017-09-11T19:43:16", "url": "https://files.pythonhosted.org/packages/a9/26/97cf56ef17eb689ee5948ef6687d055215f8559b108b695a5d3d17e65705/list_dict_DB-20170911.3.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "aea1f126410ddfa3b7dd073042539f3a", "sha256": "9321eeff79107d3a67676547444d5be06a796f7f06618396c5a132908a57c579" }, "downloads": -1, "filename": "list_dict_DB-20170911.3-py2-none-any.whl", "has_sig": false, "md5_digest": "aea1f126410ddfa3b7dd073042539f3a", "packagetype": "bdist_wheel", "python_version": "py2", "requires_python": null, "size": 15971, "upload_time": "2017-09-11T19:43:15", "url": "https://files.pythonhosted.org/packages/dd/b2/ca3391a2a37fda525a53acd3fc2931c115e7fb54e2ac293f8a280795d894/list_dict_DB-20170911.3-py2-none-any.whl" }, { "comment_text": "", "digests": { "md5": "d2f9ed62a481b61c829f7976d7ad7c55", "sha256": "f079057cec793f8fd3b60e9bee130b2ac59e12a013ba44133c6c23a23b3a4df9" }, "downloads": -1, "filename": "list_dict_DB-20170911.3.tar.gz", "has_sig": false, "md5_digest": "d2f9ed62a481b61c829f7976d7ad7c55", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 15284, "upload_time": "2017-09-11T19:43:16", "url": "https://files.pythonhosted.org/packages/a9/26/97cf56ef17eb689ee5948ef6687d055215f8559b108b695a5d3d17e65705/list_dict_DB-20170911.3.tar.gz" } ] }