{ "info": { "author": "Mindey", "author_email": "mindey@qq.com", "bugtrack_url": null, "classifiers": [], "description": "# db\nThis is an experimental MongoClient wrapper to explore a database patterns. Requires local [MongoDB](https://www.mongodb.com/) to run.\n\n```\npip install metadb\n```\n\n## Basics\n```python\nfrom metadb import MongoClient\n\ncli = MongoClient()\n\n# Create an app\ncli.create_app('hello')\ncli.apps()\n\n# Create a table\ncli.create_table('hello-world')\ncli.tables()\n\n# Define normalization type\ncli.types['hello-world'].insert(\n {'test': {'*': 'http://corresponding-concept-definition-url|lambda _: _.title()'}})\n\n# Create a record\ncli.create_item({'test': 'hello, world'}, 'hello-world')\n\n# Raw data:\ncli.items['hello-world'].find_one()\n>>> {'_id': ObjectId('5ae156142b93bc3c6f60aa11'),\n '_type': ObjectId('5ae156112b93bc3c6f60aa10'),\n 'test': 'hello, world'}\n\n# Normalized data:\ncli.index['hello-world'].find_one()\n>>> {'_id': ObjectId('5ae156142b93bc3c6f60aa11'),\n '_type': ObjectId('5ae156112b93bc3c6f60aa10'),\n 'https-www-wikidata-org-wiki-q45594': 'Hello, World'}\n\ncli.drop_app('hello')\n```\n\n\n# General ideas\n1. Let's say we have MongoDB (or any kind of key-value store).\n\n2. Let's say we have databases:\n\n- `items`\n- `types`\n- `index`\n\n3. Let's say we have a constraint for uniqueness of names in reserved collection `db[\"types\"][\"_names\"]` representing global namespace for apps registry:\n```\nuse types\n\ndb.createCollection(\"_names\", {\n validator: {\n $and: [\n {\n \"name\": {$type: \"string\", $exists: true}\n }\n ]\n }\n})\n\ndb.getCollection('_names').createIndex( {\"name\": 1}, { unique: true } )\n```\n\n4. Let's say we have a convention, that every time we create a new colletion in `items` database, each each record has to have `._type` attribute, which is an object ID from mongodb.\n\nFor example:\n```\nuse items\n\ndb.createCollection(\"app-1.0/model_name\", {\n validator: {\n $and: [\n {\n \"_type\": {$type: \"objectId\", $exists: true}\n }\n ]\n }\n})\n```\n\n5. Assume that the `._type` is like a foreign key reference to the corresponding type, stored in the `types` database in the collection with corresponding name.\n\n```\nuse types\n\ntypes[\"app-1.0/model_name\"]\n```\n\n**Corollary:** This way, each item refers to its type, and types can have histories, rather than migrations of models.\n\nE.g., whenever we change the schema for model, we can simply insert new instancde into `types` database, corresponding to new schema.\n\n5.1. Assume that `db['types']['_types']` stores unique registered type names.\n\n\n6. Assume each instance in any of collections of the `types` database, follows a pattern corresponding to the pattern of the instance in the `items` database, one level up the instance tree hierarchy, as so:\n\n```\nexample:\n\n{'a': {'b': {'c': 1}}}\n\npattern:\n{'*': 'obj', 'a': {'b': {'c': {'*': 'int'}}}}\n\n(Here, the asterisk keys specify information about records at each level, e.g., 'c' are records of 'int'.)\n```\n\nMeaning that, based on the example record provided, data may have a \"mask\" that specifies the value types:\n\n```python\nfrom boltons.iterutils import remap\n\nRESERVED_SCHEMA_KEY = '*'\n\ndef generate_pattern(instance):\n \"\"\"\n Generates pattern based on first record in data.\n \"\"\"\n\n def parse_type(value):\n return type(value).__name__\n\n if isinstance(instance, list):\n if not instance:\n print(\"If Instnace is a list, it has to be non-empty.\")\n return\n elif not isinstance(instance[0], dict):\n print(\"If Instnace is a list, it's first element has to be a dict.\")\n return\n else:\n if isinstance(instance, dict):\n instance = [instance]\n else:\n print(\"Instance must be a dict, or a list of dict.\")\n return\n\n def visit(path, key, value):\n if not any([isinstance(value, t) for t in [dict, list, tuple]]):\n return key, {RESERVED_SCHEMA_KEY: parse_type(value)}\n else:\n return key, value\n\n remapped = remap(instance, visit=visit)\n remapped[0][RESERVED_SCHEMA_KEY] = 'obj'\n return remapped[0]\n\ngenerate_pattern({'a': {'b': 'c'}})\n```\n\nSuppose we replace and enrich such mask under asterisks, with information about ontological types by providing references (e.g., URLs) to ontological vocabularies, and data type normalization rules in tuples separated by `|` like `ontological type|conversion rules`, like so:\n\n```\n{'*': 'http://www.omegawiki.org/DefinedMeaning:377726|lambda x: x', 'a': {'b': {'*': \"https://www.wikidata.org/wiki/Q11573|lambda x.replace(',', '')\"}}}\n```\n\n*Note: for lack of MongoDBs ability to store pure URLs as keys, will store correct urls in the database `db['types']['_terms']`, with required uniqueness for 'name', as so:\n\n```\nuse types\n\ndb.createCollection(\"_terms\", {\n validator: {\n $and: [\n {\n \"name\": {$type: \"string\", $exists: true}\n }\n ]\n }\n})\n\ndb.getCollection('_names').createIndex( {\"name\": 1}, { unique: true } )\n```\n\n\n**Corollary:** Every time we create a new type, we can create corresponding corresponding conversion rules and ontological-metadata, and use it to normalize data instances (hereafter - *normalized instances*), when writing them to `index` database where key is replaced with the urls from the mask, and the values are replaced by the digests of the conversion rules.\n\n7. Let's store the normalized instances in the `index` database.\n\n**Corollary:** We have a generic way to query datasets by concepts, and present their keys in arbitrary human languages for understanding and analytics.\n\n**Corollary:** We can define a generalized SQL, which works for querying knowledge-bases by concepts like:\n\n`SELECT * FROM (concept_id)`\n\nE.g.:\n\n`SELECT * FROM `, `SELECT * FROM `, `SELECT * FROM `\n\n*Note:* it seems like something similar to [SPARQL](https://en.wikipedia.org/wiki/SPARQL), but for generalized concepts from all possible vocabularies, rather than one.\n\n**Corollary:** The database thus constructed has properties:\n\n```\n...\n```\n\nI guess, that this kind of database would be convenient for creating generic scalable backends for a generic application, with both structured and unstructured data accessible for analytics through a unified interface.\n\n\nExamples:\n\n```\nnames = ['infinity']\n\ncollections = [\n 'infinity-0.1/comments'\n 'infinity-0.1/topics'\n]\n\n...\n```\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://gitlab.com/wefindx/metadb", "keywords": "", "license": "ASK FOR PERMISSIONS", "maintainer": "", "maintainer_email": "", "name": "metadb", "package_url": "https://pypi.org/project/metadb/", "platform": "", "project_url": "https://pypi.org/project/metadb/", "project_urls": { "Homepage": "https://gitlab.com/wefindx/metadb" }, "release_url": "https://pypi.org/project/metadb/0.0.9/", "requires_dist": null, "requires_python": "", "summary": "A pattern of types theory to mongodb.", "version": "0.0.9" }, "last_serial": 4083301, "releases": { "0.0.1": [ { "comment_text": "", "digests": { "md5": "c20eeee9c67b4ca2488d5cf172c7e5c8", "sha256": "c20a0f41817487ef342246eb4c2fc023c8aef925d69706102bb82016fb1bbfff" }, "downloads": -1, "filename": "metadb-0.0.1.tar.gz", "has_sig": false, "md5_digest": "c20eeee9c67b4ca2488d5cf172c7e5c8", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8440, "upload_time": "2018-06-23T23:23:33", "url": "https://files.pythonhosted.org/packages/c5/90/50071a26f70de7ef31f04673f7df3a1497fb4e45f5ab1c14e8bb043d540f/metadb-0.0.1.tar.gz" } ], "0.0.2": [ { "comment_text": "", "digests": { "md5": "05c71c6a6f461f4602a1f58b73c818d6", "sha256": "c4da5a6725a2f159f6c8819a50341713e3b74bad4acbe6e550e12454229ece1e" }, "downloads": -1, "filename": "metadb-0.0.2.tar.gz", "has_sig": false, "md5_digest": "05c71c6a6f461f4602a1f58b73c818d6", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8662, "upload_time": "2018-07-05T08:05:46", "url": "https://files.pythonhosted.org/packages/cd/f8/7b765bb39fc7a308158bfff651fa5974b3ac54007903ecf2291597f9789d/metadb-0.0.2.tar.gz" } ], "0.0.3": [ { "comment_text": "", "digests": { "md5": "770268e428c48cb73547f91f175d5dcc", "sha256": "a2adc8f0934e2bacb1b34ffe3912c6f396bdbcaac62bca203bb65b40f2e6a83d" }, "downloads": -1, "filename": "metadb-0.0.3.tar.gz", "has_sig": false, "md5_digest": "770268e428c48cb73547f91f175d5dcc", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8663, "upload_time": "2018-07-06T00:41:19", "url": "https://files.pythonhosted.org/packages/82/13/522b2e187c16d70a293d773e3bc9cbb4108dae4170b63f3977cff20d6030/metadb-0.0.3.tar.gz" } ], "0.0.4": [ { "comment_text": "", "digests": { "md5": "f6daa8f533b2a94b4707809ad5eb9532", "sha256": "4f567e57dbd2bf69b6208add09b119b1b4b4811cb236979cd468fac16f899c92" }, "downloads": -1, "filename": "metadb-0.0.4.tar.gz", "has_sig": false, "md5_digest": "f6daa8f533b2a94b4707809ad5eb9532", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 8957, "upload_time": "2018-07-06T06:24:22", "url": "https://files.pythonhosted.org/packages/ab/62/7634559fd3b1eca2a7797ace50e45282f0ecd7f7e6ba81787bfff81d13bc/metadb-0.0.4.tar.gz" } ], "0.0.5": [ { "comment_text": "", "digests": { "md5": "eba8ea885c31bd3b58f57b9d38bc94f5", "sha256": "8041e4f75785f5352ee6c8472d0b283483cb4cfe20240554da344a10ced6e5bb" }, "downloads": -1, "filename": "metadb-0.0.5.tar.gz", "has_sig": false, "md5_digest": "eba8ea885c31bd3b58f57b9d38bc94f5", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9031, "upload_time": "2018-07-08T03:35:41", "url": "https://files.pythonhosted.org/packages/d7/5c/f796ac03232be71b6c0c664364acfe22e9f3542106fa284f6f732235da26/metadb-0.0.5.tar.gz" } ], "0.0.6": [ { "comment_text": "", "digests": { "md5": "1f09209170a6c6f106a307bcd7ff5d25", "sha256": "4f33a9d49e966caf7a52d56cabb5119ba866801529ec5b2209a10702f514a331" }, "downloads": -1, "filename": "metadb-0.0.6.tar.gz", "has_sig": false, "md5_digest": "1f09209170a6c6f106a307bcd7ff5d25", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9696, "upload_time": "2018-07-08T03:50:35", "url": "https://files.pythonhosted.org/packages/43/e0/4404bf1b9ab832b567eedd2b14cd6cd358a622043cbfe164eb63af5af334/metadb-0.0.6.tar.gz" } ], "0.0.7": [ { "comment_text": "", "digests": { "md5": "0836323caf780fd6009abcf90ef9bb9a", "sha256": "8f80f601ab8e472956c7add6ce1f1e5664b03214459dd990edf8b4dbaf56309f" }, "downloads": -1, "filename": "metadb-0.0.7.tar.gz", "has_sig": false, "md5_digest": "0836323caf780fd6009abcf90ef9bb9a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9866, "upload_time": "2018-07-09T03:34:59", "url": "https://files.pythonhosted.org/packages/34/17/f0409cdb0082b63073f11011f3f28c68adfb190570ef25436487d35b0100/metadb-0.0.7.tar.gz" } ], "0.0.8": [ { "comment_text": "", "digests": { "md5": "1b0c33fa5a5874c292dc7ff144c0d95d", "sha256": "91c4316e20efb3749524fd6742405221d5c8cc32b9b4afdcd06507a87410cc41" }, "downloads": -1, "filename": "metadb-0.0.8.tar.gz", "has_sig": false, "md5_digest": "1b0c33fa5a5874c292dc7ff144c0d95d", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9956, "upload_time": "2018-07-10T01:35:43", "url": "https://files.pythonhosted.org/packages/06/c0/4165c68981c34912ad30998aeabc594293c9e53206bd09b5d16293d843cc/metadb-0.0.8.tar.gz" } ], "0.0.9": [ { "comment_text": "", "digests": { "md5": "552a1fa3b49bc70b55a9d5fa74e8fae2", "sha256": "67fd3f49d9759cd665e1317e2459aaa9fc282210fac13750804484207fe638aa" }, "downloads": -1, "filename": "metadb-0.0.9.tar.gz", "has_sig": false, "md5_digest": "552a1fa3b49bc70b55a9d5fa74e8fae2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9937, "upload_time": "2018-07-19T18:36:48", "url": "https://files.pythonhosted.org/packages/58/ab/01adb788ad0273ac2e0e74f9befa4885f9b5a63d4616ecee95ce066a670d/metadb-0.0.9.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "552a1fa3b49bc70b55a9d5fa74e8fae2", "sha256": "67fd3f49d9759cd665e1317e2459aaa9fc282210fac13750804484207fe638aa" }, "downloads": -1, "filename": "metadb-0.0.9.tar.gz", "has_sig": false, "md5_digest": "552a1fa3b49bc70b55a9d5fa74e8fae2", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 9937, "upload_time": "2018-07-19T18:36:48", "url": "https://files.pythonhosted.org/packages/58/ab/01adb788ad0273ac2e0e74f9befa4885f9b5a63d4616ecee95ce066a670d/metadb-0.0.9.tar.gz" } ] }