{ "info": { "author": "Konrad Wojas", "author_email": "konrad@wojas.nl", "bugtrack_url": null, "classifiers": [ "Framework :: Django", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Topic :: Database" ], "description": "What is it?\r\n===========\r\n\r\nDjango-denormalize allows you to convert a tree of Django ORM objects into one\r\ndata document. With 'data document' we mean a structure of dicts, lists and\r\nother primitive types, that can be serialized to JSON or a Python Pickle.\r\n\r\nThe resulting document can be used in combination with the Django cache layer\r\nto create blazingly fast views that do not hit the database. The data can also\r\nbe synced to a NoSQL store like MongoDB_, for consumption by other frameworks,\r\nlike Meteor_ (NodeJS_ based).\r\n\r\nIf any data changes in the ORM (even if it's on a some deep many-to-many\r\nrelationship far away from the root object), django-denormalize will\r\nautomatically trigger a cache invalidation of the root object's document\r\nand/or sync the new document to your preferred NoSQL store.\r\n\r\nThis module also includes special support for content in FeinCMS_ objects: all\r\nregions and content types will be available under a 'content' dictionary.\r\n\r\n\r\nExample\r\n=======\r\n\r\nFor example, suppose you have the following models:\r\n\r\n.. sourcecode:: python\r\n\r\n class Book(models.Model):\r\n title = models.CharField(_(\"title\"), max_length=80)\r\n year = models.PositiveIntegerField(_(\"year\"), null=True)\r\n authors = models.ManyToManyField(Author)\r\n ...\r\n\r\n class Author(models.Model):\r\n name = models.CharField(_(\"name\"), max_length=80)\r\n ...\r\n\r\n\r\nYou can write the following class to describe your document collection:\r\n\r\n.. sourcecode:: python\r\n\r\n from denormalize.models import DocumentCollection\r\n\r\n class BookCollection(DocumentCollection):\r\n model = Book\r\n name = \"books\"\r\n prefetch_related = ['authors']\r\n\r\n\r\nLet's print all documents:\r\n\r\n.. sourcecode:: python\r\n\r\n books = BookCollection()\r\n for doc in books.dump_collection():\r\n print doc\r\n\r\n\r\nEach document will have the following structure:\r\n\r\n.. sourcecode:: python\r\n\r\n {\r\n 'id': 42,\r\n 'title': u'Cooking for Geeks',\r\n 'year': 2010,\r\n 'authors': [\r\n {\r\n 'id': 18,\r\n 'name': u'Jeff Potter',\r\n ...\r\n }\r\n ],\r\n ...\r\n }\r\n\r\n\r\nThis in itself can be useful, but the real power of django-documentsync lies\r\nin its backends. Suppose we want to cache these documents, to avoid hitting\r\nthe database. We can use these documents in our views, instead of accessing\r\nthe Django ORM. Backend and view code:\r\n\r\n.. sourcecode:: python\r\n\r\n # In models.py\r\n\r\n from denormalize.backends.cache import CacheBackend\r\n\r\n backend = CacheBackend()\r\n backend.register(books)\r\n\r\n # In views.py\r\n\r\n def our_book_view(request, book_id):\r\n book_doc = backend.get_doc(books, book_id)\r\n if not book_doc:\r\n raise Http404(\"Book not found\")\r\n return render(request, 'book.html', {'book': book_doc})\r\n\r\n\r\nOur `CacheBackend` will try to fetch the book document from the Django cache.\r\nIf it cannot be found, it will generate the document from the ORM and then\r\nstore it in the cache.\r\n\r\nAnd best of all: if any data on the Author or Book objects for this book\r\nchanges, the cache will automatically be invalidated for us! The `book_doc`\r\nwe retrieve, will always be up to date.\r\n\r\n\r\nHow does this compare with simply using the Django page cache?\r\n--------------------------------------------------------------\r\n\r\nThe traditional approach to Django scalability is using the page cache to\r\ncache the entire page rendered by the view. This works quite well, but it has\r\ntwo big disadvantages:\r\n\r\n* The cache will not automatically be invalidated as soon as the underlying\r\n data changes. If you set the page cache time to 60 seconds, it will take\r\n up to 60 seconds for a change to be visible on the site.\r\n* This approach does not work well for websites where users can login and\r\n see customized content.\r\n\r\nIn simpler cases, these problems can be worked around by using template\r\nfragment caching, as this allows you to cache common regions, and specify\r\nwhich variables should be incorporated into the cache key. But even in our\r\nsimple Book example, it's not easy to invalidate the cache on changes to Author.\r\n\r\nThe disadvantages of the django-denormalize approach are:\r\n\r\n* You no longer have access to the Django models and its methods in your\r\n templates. You are dealing with the raw data. Of course, you can add any\r\n extra information you might need in the template by extending the\r\n `DocumentCollection`, or by creating custom template filters to calculate\r\n some value.\r\n* Writes by the ORM to models that are included in documents are slower,\r\n because they are monitored for changes.\r\n\r\n\r\nMongoDB backend\r\n===============\r\n\r\nThe MongoDB_ backend works quite similar to the `CacheBackend`:\r\n\r\n.. sourcecode:: python\r\n\r\n # In models.py\r\n\r\n from denormalize.backends.mongodb import MongoBackend\r\n\r\n backend = MongoBackend(\r\n name='mongo',\r\n db_name='test_denormalize',\r\n connection_uri='mongodb://localhost')\r\n backend.register(books)\r\n\r\n\r\nBecause the data is persistent and accessed directly through the MongoDB API,\r\nyou need to make care to keep it in sync. You can trigger a full one-way sync\r\nusing the following management command (TODO: currently not implemented yet\r\nfor the MongoBackend, only for LocMemBackend. Coming soon!)::\r\n\r\n $ ./manage.py denormalize_sync mongo books\r\n\r\nWhenever you update the data through the ORM, the corresponding document will\r\nbe updated automatically. The backend preserves any extra keys you may have set\r\non the document root in MongoDB. Make sure, however, to not add or change keys\r\non subdocuments created by the driver, because they will be overwritten. In the\r\nbook example above, it is safe to set `doc['foo']`, but not safe to set\r\n`doc['authors'][0]['foo']`.\r\n\r\nYou should run full syncs in a cronjob, though, to prevent your data from\r\ngoing out of sync over time due to network outages and changes that\r\nbypass the ORM (see 'bugs and limitations' below).\r\n\r\n\r\nCreating aggregate collections\r\n==============================\r\n\r\nOccasionaly you may want to aggregate data from more than one object on the\r\nroot model. The key differences here are:\r\n\r\n* The output documents do not have a 1:1 relation with the input documents.\r\n* Any change on any root object should trigger an update.\r\n\r\nUse cases:\r\n\r\n* Creating one document with a tree structure of pages or categories\r\n to generate a menu.\r\n* Calculating statistics about data stored in an entire table.\r\n* Generating an index document, mapping one field to\r\n the ids of the documents where the field has a certain value.\r\n\r\n`AggregateCollection` makes this really easy. The following collection will\r\ncreate an index by tag::\r\n\r\n class BookTagIndexCollection(AggregateCollection):\r\n model = Book\r\n name = 'book_tags'\r\n prefetch_related = 'tags'\r\n\r\n def aggregate(self, key):\r\n assert key == 'default'\r\n index = {}\r\n for book in self.queryset().all():\r\n for tag in book.tags.all():\r\n tagname = tag.name\r\n index.setdefault(tagname, set()).add(book.id)\r\n return index\r\n\r\n\r\nFeinCMS support\r\n===============\r\n\r\nDjango-denormalize has experimental special support for FeinCMS. If you use\r\nthe special `FeinCMSCollection`, the `content` attribute will be set to a dict\r\nwith all regions represented as lists. All content types are included by \r\ndefault. If you want to follow relations on content types, you need to \r\nexplicitly define all relations to follow. This will become easier in the\r\nfuture.\r\n\r\n\r\nPerformance optimization\r\n========================\r\n\r\n@@@TODO: explain how to prevent spurious updates using `denormalize.context`.\r\n\r\n\r\nDisadvantages, bugs and implementation notes\r\n============================================\r\n\r\nBugs and limitations:\r\n\r\n* Django-normalize had not yet been extensively tested in real world\r\n applications. Expect bugs. And since it's an early beta release, there\r\n is no guarantee that the API will not change without warning in the near\r\n future.\r\n* Using django-denormalize on models that receive a lot of writes might\r\n significantly slow down your application, as every write will trigger\r\n database queries to determine the affected documents, and regeneration\r\n of the documents that have changes. Keep you view counters and last login\r\n timestamps out of the models included in documents! (You might want to\r\n move these to a NoSQL store anyway.)\r\n* If you bypass the ORM (raw queries, `manage.py dbshell`,\r\n other applications, etc), django-denormalize cannot detect\r\n the changes made to the models. After perform a large batch\r\n operation, flush the Django cache, or run a full sync (denormalize_sync\r\n management command) to update your NoSQL backend, depending on how you use\r\n django-denormalize.\r\n* If syncing to a NoSQL store and the NoSQL database is not available, you\r\n will lose the update, it is currently not rescheduled (TODO: implement\r\n a transaction log to keep track of changes and whether they have been\r\n properly synced or not). You should run a regular full sync in a cronjob.\r\n* Syncing happens only one way. If you want to change data, you need to\r\n perform the modification on the ORM side, not a NoSQL side. We do try\r\n hard not to overwrite any extra attributes you added in the NoSQL backends.\r\n* A full sync currently does not delete stale objects (TODO)\r\n* Keep the storage limitations of your backends in mind. Memcached can only\r\n store objects of up to 1MB, MongoDB has a limit of 16MB. Make sure your\r\n documents will not exceed these limits.\r\n\r\n\r\nTypes of projects that would benefit most of django-denormalize:\r\n\r\n* Writes are rare and mostly occur due to content updates in the Django admin,\r\n like in CMS systems.\r\n* There are a lot more reads than writes, and you want to speed up the read\r\n views, while keeping the front-end personalized and responsive to data\r\n changes.\r\n* You want to use Meteor_ to build the front-end side of your application,\r\n but do not feel like implementing a CMS in Meteor. Django-denormalize\r\n allows you to build the CMS backend using the Django admin and FeinCMS_.\r\n This was the original reason to start this project, so expect more updates\r\n to support this!\r\n* You want to use MongoDB_ to access/query your data, but prefer to keep your\r\n primary data in a traditional, proven, relation database system you have\r\n 10 years experience with, because it makes you or your DBA sleep better.\r\n\r\n\r\nAlternatives\r\n------------\r\n\r\nDjango-nonrel_ allows you to use the Django ORM to directly access a NoSQL\r\ndatabase, but with limitations. If you do a lot of writes from your front-end\r\nviews, or want to prevent data duplication, this might be a better solution.\r\n\r\nPS: Need another backend? Writing one is quite simple! You only need to override\r\na base class, and implement a few methods.\r\n\r\n\r\n.. _Meteor: http://meteor.com/\r\n.. _NodeJS: http://nodejs.org/\r\n.. _FeinCMS: http://www.feincms.org/\r\n.. _MongoDB: http://www.mongodb.org/\r\n.. _Django-nonrel: http://www.allbuttonspressed.com/projects/django-nonrel", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://bitbucket.org/wojas/django-denormalize/", "keywords": "django orm cache mongodb nosql meteor", "license": "LICENSE", "maintainer": "", "maintainer_email": "", "name": "django-denormalize", "package_url": "https://pypi.org/project/django-denormalize/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/django-denormalize/", "project_urls": { "Download": "UNKNOWN", "Homepage": "https://bitbucket.org/wojas/django-denormalize/" }, "release_url": "https://pypi.org/project/django-denormalize/0.2.2/", "requires_dist": null, "requires_python": null, "summary": "Converts Django ORM objects into data documents, and keeps them in sync", "version": "0.2.2" }, "last_serial": 982981, "releases": { "0.1.0": [], "0.2.0": [ { "comment_text": "", "digests": { "md5": "066c8794e30207dc29fe462fedcc4146", "sha256": "94728af4195f069a3083d824f871689b2875883b10425b39bd868f470bc7971e" }, "downloads": -1, "filename": "django-denormalize-0.2.0.tar.gz", "has_sig": false, "md5_digest": "066c8794e30207dc29fe462fedcc4146", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 24598, "upload_time": "2013-05-24T04:53:59", "url": "https://files.pythonhosted.org/packages/6c/ff/2f06ec1d3e82818c961c6945c321615657c14797f8215fcb917e0e26b2b7/django-denormalize-0.2.0.tar.gz" } ], "0.2.1": [ { "comment_text": "built for Darwin-13.0.0", "digests": { "md5": "8beec6d87f9e65890f384f0fecf25172", "sha256": "57c7972544ce1f2d7afe5cf468aa0ade180f2c2967c6c7f43045aa99fd3cc1c3" }, "downloads": -1, "filename": "django-denormalize-0.2.1.macosx-10.4-x86_64.tar.gz", "has_sig": false, "md5_digest": "8beec6d87f9e65890f384f0fecf25172", "packagetype": "bdist_dumb", "python_version": "any", "requires_python": null, "size": 41536, "upload_time": "2014-01-27T15:06:12", "url": "https://files.pythonhosted.org/packages/e2/23/b3104490ab304201ff845848ff945be53ac20d92a9395f3cb43cca583954/django-denormalize-0.2.1.macosx-10.4-x86_64.tar.gz" }, { "comment_text": "", "digests": { "md5": "3bfda2a66d02ec609b6590acf1beb55a", "sha256": "2021198ef0b3ce759515447e7ca097eaad95c377a2cac6b986f1b4a145c4dc63" }, "downloads": -1, "filename": "django_denormalize-0.2.1-py2.7.egg", "has_sig": false, "md5_digest": "3bfda2a66d02ec609b6590acf1beb55a", "packagetype": "bdist_egg", "python_version": "2.7", "requires_python": null, "size": 58784, "upload_time": "2014-01-27T15:06:17", "url": "https://files.pythonhosted.org/packages/2d/a2/cad855669d701e86b0dab36a07f49692e778fe82616dc777956f550a23dd/django_denormalize-0.2.1-py2.7.egg" }, { "comment_text": "", "digests": { "md5": "a561ab2c2666f8396d71f909192de54a", "sha256": "437f1c7eac2ec41a333ca408a26fe92e9a409b0f694343ca14c8b7c55b5d406d" }, "downloads": -1, "filename": "django-denormalize-0.2.1.tar.gz", "has_sig": false, "md5_digest": "a561ab2c2666f8396d71f909192de54a", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25428, "upload_time": "2014-01-27T14:53:26", "url": "https://files.pythonhosted.org/packages/7e/f5/7c64fe3a4b57a6d739b48cf021633f1110ad0e278353df108db8b030a930/django-denormalize-0.2.1.tar.gz" } ], "0.2.2": [ { "comment_text": "", "digests": { "md5": "5647d6567fb0baf12c2137567f655c41", "sha256": "1a674e044591ce9a18f9022ce6748ad34a6c6ca5d90bfbe35fe76cf947f8bb7b" }, "downloads": -1, "filename": "django-denormalize-0.2.2.tar.gz", "has_sig": false, "md5_digest": "5647d6567fb0baf12c2137567f655c41", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25137, "upload_time": "2014-01-27T15:12:02", "url": "https://files.pythonhosted.org/packages/9a/9b/814133d15c2b78a5b6142a5f21ddde00da4200ba1b3b862ec970b80c206c/django-denormalize-0.2.2.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "5647d6567fb0baf12c2137567f655c41", "sha256": "1a674e044591ce9a18f9022ce6748ad34a6c6ca5d90bfbe35fe76cf947f8bb7b" }, "downloads": -1, "filename": "django-denormalize-0.2.2.tar.gz", "has_sig": false, "md5_digest": "5647d6567fb0baf12c2137567f655c41", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25137, "upload_time": "2014-01-27T15:12:02", "url": "https://files.pythonhosted.org/packages/9a/9b/814133d15c2b78a5b6142a5f21ddde00da4200ba1b3b862ec970b80c206c/django-denormalize-0.2.2.tar.gz" } ] }