{ "info": { "author": "", "author_email": "", "bugtrack_url": null, "classifiers": [], "description": "Udb\n===\n\n.. image:: https://travis-ci.org/akaterra/udb.py.svg?branch=master\n\nUdb is an in-memory database based on the `Zope Foundation BTrees `_, the `Rtree `_ and on the native python's dict.\nUdb provides indexes support and MongoDB-like queries.\nUdb does not support any type of transactions for now.\n\nTable of contents\n-----------------\n\n* `Requirements <#requirements>`_\n\n* `Installation <#installation>`_\n\n* `Quick start <#quick-start>`_\n\n* `Data schema for default values <#data-schema-for-default-values>`_\n\n * `Functional fields <#functional-fields>`_\n\n* `Indexes <#indexes>`_\n\n * `Index declaration <#index-declaration>`_\n\n * `Scan operations <#scan-operations>`_\n\n * `Float precision <#float-precision>`_\n\n* `Querying <#querying>`_\n\n * `Query validation <#query-validation>`_\n\n * `Comparison order <#comparison-order>`_\n\n* `Storages <#storages>`_\n\n* `Select operation <#select-operation>`_\n\n* `Delete operation <#delete-operation>`_\n\n* `Insert operation <#insert-operation>`_\n\n* `Update operation <#update-operation>`_\n\n* `Running tests <#running-tests-with-pytest>`_\n\n* `Benchmarks <#benchmarks>`_\n\nRequirements\n------------\n\nPython 2.7, Python 3.6\n\nInstallation\n------------\n\n.. code:: bash\n\n pip install udb\n\nTo enable BTree indexes support install `Zope Foundation BTrees `_ package:\n\n.. code:: bash\n\n pip install BTrees\n\nTo enable RTree indexes support install `Rtree `_ package (requires `libspatialindex `_):\n\n.. code:: bash\n\n pip install Rtree\n\nQuick start\n-----------\n\nCreate the Udb instance with the indexes declaration:\n\n.. code:: python\n\n from udb import Udb, UdbBtreeIndex\n\n db = Udb({\n 'a': UdbBtreeIndex(['a']),\n 'b': UdbBtreeIndex(['b']),\n 'cde': UdbBtreeIndex(['c', 'd', 'e']),\n })\n\nInsert records:\n\n.. code:: python\n\n db.insert({'a': 1, 'b': 1, 'c': 3, 'd': 4, 'e': 5})\n db.insert({'a': 2, 'b': 2, 'c': 3, 'd': 4, 'e': 5})\n db.insert({'a': 3, 'b': 3, 'c': 3, 'd': 4, 'e': 5})\n db.insert({'a': 4, 'b': 4, 'c': 3, 'd': 4, 'e': 6})\n db.insert({'a': 5, 'b': 5, 'c': 3, 'd': 4, 'e': 7})\n\nSelect records:\n\n.. code:: python\n\n a = list(db.select({'a': 1})\n\n [{'a': 1}]\n\n b = list(db.select({'b': 0})\n\n [] # no records with b=0\n\n c = list(db.select({'c': 3, 'd': 4})\n\n [{'c': 3, 'd': 4, 'e': 5}, {'c': 3, 'd': 4, 'e': 6}, {'c': 3, 'd': 4, 'e': 7}]\n\nData schema for default values\n------------------------------\n\nData schema allows to fill the inserted record with default values.\nThe default value can be defined as a primitive value or callable:\n\n.. code:: python\n\n from udb import Udb\n\n db = Udb(schema={\n 'a': 'a',\n 'b': 'b',\n 'c': lambda key, record: 'b' if record['b'] == 'b' else 'c',\n })\n\nFunctional fields\n~~~~~~~~~~~~~~~~~\n\n**auto_id** - generates unique id (uuid v1 by default)\n\n.. code:: python\n\n from udb import Udb, auto_id\n\n db = Udb(schema={\n 'id': auto_id(),\n })\n\n**current_timestamp** - uses current timestamp (as int value)\n\n.. code:: python\n\n from udb import Udb, current_timestamp\n\n db = Udb(schema={\n 'timestamp': current_timestamp(),\n })\n\n**fn** - calls custom function\n\n.. code:: python\n\n from udb import Udb, fn\n\n db = Udb(schema={\n 'timestamp': fn(lambda key, record: record['a'] + record['b']),\n })\n\n**optional** - returns \"None\" value\n\n.. code:: python\n\n from udb import Udb, optional\n\n db = Udb(schema={\n 'a': optional,\n })\n\nIndexes\n-------\n\nTo speed up the search for records, the necessary fields can be indexed.\nThe Udb also includes a simple query optimiser that can select the most appropriate index.\n\nBTree indexes:\n\n* **UdbBtreeMultivaluedIndex** - btree based multivalued index supporting multiple records with the same index key.\n\n* **UdbBtreeMultivaluedEmbeddedIndex** - same as the **UdbBtreeMultivaluedIndex**, but supports embedded list of values.\n\n* **UdbBtreeUniqIndex** - btree based index operating with always single records, but the second record insertion with the same index key will raise IndexConstraintError.\n\n* **UdbBtreeIndex** - btree based index operating with always single records, so that the second record insertion with the same index key will overwrite the old one. Can be used when the inserting record definitely generates a unique index key.\n\nHash indexes:\n\n* **UdbHashMultivaluedIndex** - hash based multivalued index supporting multiple records with the same index key.\n\n* **UdbHashMultivaluedEmbeddedIndex** - same as the **UdbHashMultivaluedIndex**, but supports embedded list of values.\n\n* **UdbHashUniqIndex** - hash based index operating with always single records, but the second record insertion with the same index key will raise IndexConstraintError.\n\n* **UdbHashIndex** - hash based index operating with always single records, so that the second record insertion with the same index key will overwrite the old one. Can be used when the inserting record definitely generates a unique index key.\n\nSpatial indexes:\n\n* **UdbRtreeIndex** - spatial index that supports \"intersection with rectangle\" and \"near to point\" search\n\nIndex declaration\n~~~~~~~~~~~~~~~~~\n\nAs it was shown `above <#quick-start>`_, for the index declaration the Udb instance should be created with the **indexes** parameter that provides dict with the key as an index name and value as an index instance.\nThe index instance should be created with the sequence of fields (1 at least) which will be fetched in the declared order from the indexed record.\nBy this sequence of fields, the index key will be generated and will be associated with the indexed record.\n\n.. code:: python\n\n from udb import Udb, UdbBtreeIndex\n\n db = Udb(indexes={\n 'abc': UdbBtreeIndex(['a', 'b', 'c']) # \"a\", \"b\" and \"c\" fields will be fetched from the indexed record\n })\n\n record = {'a': 'A', 'b': 'B', 'c': 'C'} # index key=ABC\n\nIn this case of declaration in order that the record to be indexed it must contain all of the fields declared in the sequence of index fields.\n\n.. code:: python\n\n from udb import Udb, UdbBtreeIndex\n\n db = Udb(indexes={\n 'abc': UdbBtreeIndex(['a', 'b', 'c']) # \"a\", \"b\" and \"c\" fields will be fetched from the indexed record\n })\n\n record = {'a': 'A', 'b': 'B'} # won't be indexed, raises FieldRequiredError\n\nOr using dictionary in case of Python3:\n\n.. code:: python\n\n from udb import Udb, UdbBtreeIndex, required\n\n db = Udb(indexes={\n 'abc': UdbBtreeIndex({'a': required, 'b': required, 'c': required}) # \"a\", \"b\" and \"c\" fields will be fetched from the indexed record\n })\n\n record = {'a': 'A', 'b': 'B'} # won't be indexed, raises FieldRequiredError\n\nOr using list ot tuples in case of Python2 (to save order of keys):\n\n.. code:: python\n\n from udb import Udb, UdbBtreeIndex, required\n\n db = Udb(indexes={\n 'abc': UdbBtreeIndex([('a', required), ('b', required), ('c', required)]) # \"a\", \"b\" and \"c\" fields will be fetched from the indexed record\n })\n\n record = {'a': 'A', 'b': 'B'} # won't be indexed, raises FieldRequiredError\n\nThe default value for missing field can be defined as a primitive value or callable:\n\n.. code:: python\n\n from udb import Udb, UdbBtreeIndex\n\n db = Udb(indexes={\n 'abc': UdbBtreeIndex({'a': 'a', 'b': 'b', 'c': 'c'})\n })\n\n record = {'a': 'A', 'c': 'C'} # index key=AbC\n\n.. code:: python\n\n from udb import Udb, UdbBtreeIndex\n\n db = Udb(indexes={\n 'abc': UdbBtreeIndex({'a': 'a', 'b': lambda values: 'b', 'c': 'c'})\n })\n\n record = {'a': 'A', 'c': 'C'} # index key=AbC\n\nScan operations\n~~~~~~~~~~~~~~~\n\n* **const** - an index covers only one record by the index key\n\n* **in** - an index covers multiple records by the list of the index keys, each of which covers exactly one record\n\n* **range** - an index covers multiple records by the index keys set by the minimum and maximum values\n\n* **prefix** - an index covers range of record by the partial index key\n\n* **prefix_in** - an index covers multiple records by the list of the partial index keys, each of which covers range of records\n\n* **intersection** - an index covers records intersected by the rectangle\n\n* **near** - an index covers records near to the point\n\n* **seq** - scanning that is not covered by any index, all records will be scanned\n\nFloat precision\n~~~~~~~~~~~~~~~\n\nTo be able to index float values enable the float mode with necessary precision (number of decimals):\n\n.. code:: python\n\n from udb import Udb, UdbBtreeIndex\n\n db = Udb(indexes={\n 'abc': UdbBtreeIndex(['a']).set_float_precision(10)\n })\n\n db.insert({'a': 3.1415926525})\n\nQuerying\n--------\n\nUdb supports limited MongoDB-like queries which can be used in the delete, select or update operations.\nThe query generally is a python's dict with the key as a field and value as a primitive value or an equality condition over the field.\nThe query dict is **mutable**, therefore it needs to be initialized every time anew.\n\nSupported query operations:\n\n* **$eq** - equal to a value\n\n .. code:: python\n\n udb.select({'a': {'$eq': 5}})\n\n* **$gt** - greater then value\n\n .. code:: python\n\n udb.select({'a': {'$gt': 5}})\n\n* **$gte** - greater or equals to a value\n\n .. code:: python\n\n udb.select({'a': {'$gte': 5}})\n\n* **$in** - equal to an any value in the list of a values\n\n .. code:: python\n\n udb.select({'a': {'$in': 5}})\n\n* **$intersection** - intersection with rectangle\n\n .. code:: python\n\n udb.select({'a': {'$intersection': {'minX': 5, 'minY': 5, 'maxX': 1, 'maxY': 5}}})\n\n* **$lt** - less then value\n\n .. code:: python\n\n udb.select({'a': {'$lt': 5}})\n\n* **$lte** - less or equal to a value\n\n .. code:: python\n\n udb.select({'a': {'$lte': 5}})\n\n* **$ne** - not equal to a value\n\n .. code:: python\n\n udb.select({'a': {'$ne': 5}})\n\n * performs \"seq\" scan.\n\n* **$near** - near to point with optional min and max distances\n\n .. code:: python\n\n udb.select({'a': {'$near': {'x': 5, 'y': 5, 'minDistance': 1, 'maxDistance': 5}}})\n\n * allocates sort buffer is case of \"seq\" scan\n\n * selects all records in case of unset *maxDistance* and set *minDistance*.\n\n* **$nin** - not equal to an any value in the list of a values\n\n .. code:: python\n\n udb.select({'a': {'$nin': [1, 2, 3]}})\n\n * performs \"seq\" scan.\n\n* **primitive value** - equal to a value\n\n .. code:: python\n\n udb.select({'a': 5})\n\nExample:\n\n.. code:: python\n\n records = list(udb.select({'a': 1}))\n records = list(udb.select({'a': {'$gte': 1, '$lte': 3}}))\n records = list(udb.select({'a': {'$in': [1, 2, 3], '$lte': 2}}))\n\nQuery validation\n~~~~~~~~~~~~~~~~\n\nBy default Udb does not check the query dict validity.\nTo check its validity use **validate_query** method.\n\n.. code:: python\n\n udb.validate_query({'a': {'$gte': [1, 2, 3]}}) # raises InvalidScanOperationValueError('a.$gte')\n\nComparison order\n~~~~~~~~~~~~~~~~\n\nDue to the fact that the Udb database is not strictly typed for indexed values, there is the following order of ascending comparisons for values \u200b\u200bof different types:\n\n* None\n\n* boolean - *false* less then *true*\n\n* int, float\n\n* string\n\nSo, for example, the record containing *int* value always greater than the record containing *boolean* value for the same field.\nAlso, it means, that the records having indexed field will be fetched in the provided order.\n\nStorages\n--------\n\nThe storage allows to keep data persistent.\n\n**UdbJsonFileStorage** stores data in the JSON file.\n\n.. code:: python\n\n from udb import UdbJsonFileStorage\n\n db = Udb(storage=UdbJsonFileStorage('db'))\n\n db.load_db()\n\n db.insert({'a': 'a'})\n\n db.save_db()\n\n**UdbWalStorage** stores data of delete, insert and update operations in the WAL (Write-Ahead-Logging) file chronologically.\n\n.. code:: python\n\n from udb import UdbWalStorage\n\n db = Udb(storage=UdbWalStorage('db'))\n\n db.load_db()\n\n db.insert({'a': 'a'})\n\n db.save_db() # does nothing; delete, insert and update data will be stored on the fly\n\nSelect operation\n----------------\n\nSelected records are **mutable**, so avoid to update them directly.\nOtherwise use copy on select mode:\n\n.. code:: python\n\n udb.set_copy_on_select()\n\nTo limit the result subset to particular number of records use **limit** parameter:\n\n.. code:: python\n\n records = list(udb.select({'a': 1}, limit=5)\n\nTo fetch the result subset from the particular offset use **offset** parameter:\n\n.. code:: python\n\n records = list(udb.select({'a': 1}, offset=5)\n\nDelete operation\n----------------\n\n.. code:: python\n\n udb.delete(q={'a': 1}, offset=5)\n\nInsert operation\n----------------\n\n.. code:: python\n\n udb.insert({'a': 1})\n\nUpdate operation\n----------------\n\n.. code:: python\n\n udb.update({'a': 2}, q={'a': 1}, offset=5)\n\nRunning tests with pytest\n-------------------------\n\n.. code:: bash\n\n pytest . --ignore=virtualenv -v\n\nBenchmarks\n----------\n\n* Intel Core i7, 3.58 GHz, 4 cores, disabled HT\n\n* 16GB 1600 MHz RAM\n\n* PyPy3\n\n.. code:: text\n\n INSERT (BTREE, 1ST INDEX COVERS 1 FIELD)\n\n Total time: 2.9712460041046143 sec., per sample: 2.971246004104614e-06 sec., samples per second: 336559.1400437912, total samples: 1000000\n\n SELECT (BTREE, 1ST INDEX COVERS 1 FIELD)\n\n Total time: 1.7301840782165527 sec., per sample: 1.7301840782165527e-06 sec., samples per second: 577973.1836573046, total samples: 1000000\n\n INSERT (BTREE, 1ST INDEX COVERS 1 FIELD, 2ND INDEX COVERS 1 FIELD, 3RD INDEX COVERS 2 FIELDS)\n\n Total time: 6.8810200691223145 sec., per sample: 6.881020069122315e-06 sec., samples per second: 145327.29013353275, total samples: 1000000\n\n SELECT (BTREE, 1ST INDEX COVERS 1 FIELD, 2ND INDEX COVERS 1 FIELD, 3RD INDEX COVERS 2 FIELDS)\n\n Total time: 1.8345210552215576 sec., per sample: 1.8345210552215576e-06 sec., samples per second: 545101.4024361953, total samples: 1000000\n\n INSERT (HASH, 1ST INDEX COVERS 1 FIELD)\n\n Total time: 1.781458854675293 sec., per sample: 1.781458854675293e-06 sec., samples per second: 561337.6909467103, total samples: 1000000\n\n SELECT (HASH, 1ST INDEX COVERS 1 FIELD)\n\n Total time: 0.8209011554718018 sec., per sample: 8.209011554718018e-07 sec., samples per second: 1218173.458929125, total samples: 1000000\n\n INSERT (HASH, 1ST INDEX COVERS 1 FIELD, 2ND INDEX COVERS 1 FIELD, 3RD INDEX COVERS 2 FIELDS)\n\n Total time: 4.138401985168457 sec., per sample: 4.138401985168457e-06 sec., samples per second: 241639.16496847855, total samples: 1000000\n\n SELECT (HASH, 1ST INDEX COVERS 1 FIELD, 2ND INDEX COVERS 1 FIELD, 3RD INDEX COVERS 2 FIELDS)\n\n Total time: 1.001291036605835 sec., per sample: 1.001291036605835e-06 sec., samples per second: 998710.628020589, total samples: 1000000\n\n INSERT (RTREE, 1ST INDEX COVERS 1 FIELD)\n\n Total time: 9.943094968795776 sec., per sample: 9.943094968795777e-05 sec., samples per second: 10057.230702696503, total samples: 100000\n\n SELECT (RTREE, 1ST INDEX COVERS 1 FIELD, LIMIT = 5)\n\n Total time: 11.716284990310669 sec., per sample: 0.00011716284990310669 sec., samples per second: 8535.128676256994, total samples: 100000\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/akaterra/udb.py", "keywords": "", "license": "MIT", "maintainer": "", "maintainer_email": "", "name": "udb-py", "package_url": "https://pypi.org/project/udb-py/", "platform": "", "project_url": "https://pypi.org/project/udb-py/", "project_urls": { "Homepage": "https://github.com/akaterra/udb.py" }, "release_url": "https://pypi.org/project/udb-py/0.1.0/", "requires_dist": null, "requires_python": "", "summary": "Lightweight in-memory database", "version": "0.1.0" }, "last_serial": 5900271, "releases": { "0.1.0": [ { "comment_text": "", "digests": { "md5": "09d1a2d13b3cbd5653e0c005ee7c82da", "sha256": "d385dc227d814d02b95b04626ef337c69bd43991c60f6c00dec5f989d44d3cc5" }, "downloads": -1, "filename": "udb_py-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "09d1a2d13b3cbd5653e0c005ee7c82da", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 17097, "upload_time": "2019-09-28T18:13:15", "url": "https://files.pythonhosted.org/packages/86/24/c9da60f23e8d4b962f86ea49d68da0d8dc9a89ded0c4764c2358a74c1b46/udb_py-0.1.0-py3-none-any.whl" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "09d1a2d13b3cbd5653e0c005ee7c82da", "sha256": "d385dc227d814d02b95b04626ef337c69bd43991c60f6c00dec5f989d44d3cc5" }, "downloads": -1, "filename": "udb_py-0.1.0-py3-none-any.whl", "has_sig": false, "md5_digest": "09d1a2d13b3cbd5653e0c005ee7c82da", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 17097, "upload_time": "2019-09-28T18:13:15", "url": "https://files.pythonhosted.org/packages/86/24/c9da60f23e8d4b962f86ea49d68da0d8dc9a89ded0c4764c2358a74c1b46/udb_py-0.1.0-py3-none-any.whl" } ] }