{ "info": { "author": "Malthe Borch", "author_email": "mborch@gmail.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "Operating System :: POSIX", "Programming Language :: Python", "Programming Language :: Python :: 2", "Programming Language :: Python :: 2.6", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.1", "Programming Language :: Python :: 3.2", "Programming Language :: Python :: 3.3", "Topic :: Database" ], "description": "Overview\n========\n\nDobbin is a fast and convenient way to persist a Python object graph\non disk.\n\n The object graph consists of *persistent nodes*, which are objects\n that are based on one of the persistent base classes::\n\n from dobbin.persistent import Persistent\n\n foo = Persistent()\n foo.bar = 'baz'\n\n Each of these nodes can have arbitrary objects connected to it; the\n only requirement is that Python's `pickle\n `_ module can serialize\n the objects.\n\n Persistent objects are fully object-oriented::\n\n class Frobnitz(Persistent):\n ...\n\n The object graph is built by object reference::\n\n foo.frob = Frobnitz()\n\n To commit changes to disk, we use the ``commit()`` method from the\n `transaction `_\n module. Note that we must first elect a root object, thus connecting\n the object graph to the database handle::\n\n from dobbin.database import Database\n\n jar = Database('data.fs')\n jar.elect(foo)\n\n transaction.commit()\n\n Consequently, if we want to make changes to one or more objects in\n the graph, we must first *check out* the objects in question::\n\n from dobbin.persistent import checkout\n\n checkout(foo)\n foo.bar = 'boz'\n\n transaction.commit()\n\n The ``checkout(obj)`` function puts the object in *shared* state. 
It\n only works on objects that are persistent nodes.\n\nDobbin is available on Python 2.6 and up, including Python 3.x.\n\nKey features:\n\n- 100% Python, fully compliant with `PEP8 `_\n- Threads share data when possible\n- Multi-threaded, multi-process `MVCC\n `_\n concurrency model\n- Efficient storage and streaming of binary blobs\n- Pluggable architecture\n\nGetting the code\n----------------\n\nYou can `download `_ the\npackage from the Python package index or install the latest release\nusing setuptools or the newer `distribute\n`_ (required for Python 3.x)::\n\n $ easy_install dobbin\n\nNote that this will install the `transaction\n`_ module as a package\ndependency.\n\nThe project is hosted in a `GitHub repository\n`_. Code contributions are\nwelcome. The easiest way is to use the `pull request\n`_ interface.\n\n\nAuthor and license\n------------------\n\nWritten by Malthe Borch.\n\nThis software is made available under the BSD license.\n\n\nNotes\n=====\n\nFrequently asked questions\n--------------------------\n\nThis section lists frequently asked questions.\n\n#) How is Dobbin different from ZODB?\n\n There are other object databases available for Python, most notably\n `ZODB `_ from Zope Corporation.\n\n Key differences:\n\n - Dobbin is written 100% in Python. The persistence layer in ZODB\n is written in C. ZODB also comes with support for B-Trees; this\n is also written in C.\n\n - Dobbin is available on Python 3 (but requires a POSIX system).\n\n - ZODB comes with support for B-Trees which allows processes to\n load objects on demand (because of the implicit weak\n reference). Dobbin currently loads all data at once and keeps it\n in memory.\n\n - Dobbin uses a persistence model that tries to share data in\n active objects between threads, but relies on an explicit\n operation to put an object in a mode that allows making changes\n to it. 
ZODB shares only inactive object data.\n\n - ZODB comes with ZEO, an enterprise-level network database layer\n which allows processes on different machines to connect to the\n same database.\n\n - ZODB comes with a memory management system that evicts object\n data from memory based on usage. Dobbin does not attempt to\n manage memory consumption, but calls upon the virtual memory\n manager to swap inactive object data to disk.\n\n#) What is the database file format?\n\n The default storage option writes transactions sequentially to a\n single file.\n\n Each transaction consists of a number of records, each of which consists of a\n Python pickle and sometimes an attached payload of data (in which\n case the pickle contains control information). Finally, the\n transaction ends with a transaction record object, also a Python\n pickle.\n\n#) Can I connect to a single database with multiple processes?\n\n Yes.\n\n The default storage option writes transactions to a single file,\n which alone makes up the storage record. Multiple processes can\n connect to the same file and share the same database,\n concurrently. No further configuration is required; the database\n uses POSIX file-locking to ensure exclusive write-access and\n processes automatically stay synchronized.\n\n#) How can I limit memory consumption?\n\n To avoid memory thrashing, limit the physical memory allowance of\n your Python processes and make sure there is enough virtual memory\n available (at least the size of your database) [#]_.\n\n You may want to compile Python with the ``--without-pymalloc`` flag to\n use native memory allocation. This may improve performance in\n applications that connect to large databases due to better paging.\n\n.. [#] On UNIX the ``ulimit`` command can be used to limit physical memory\n usage; this prevents thrashing when working with large databases.\n\n\n\nUser's guide\n============\n\nThis is the primary documentation for the database. 
It uses an\ninteractive narrative which doubles as a doctest. There's a suite of\nregression tests included in the distribution.\n\nYou can run the tests by issuing the following command at the\ncommand-line prompt::\n\n$ python setup.py test\n\nSetup\n-----\n\nThe default storage option writes transactions sequentially in a\nsingle file. It's optimized for long-running processes,\ne.g. application servers.\n\nThe first step is to initialize a database object. To configure it we\nprovide a path on the file system. The path needn't exist already.\n\n>>> from dobbin.database import Database\n>>> db = Database(database_path)\n\nThis particular path does not already exist. This is a new\ndatabase. We can verify it by using the ``len`` method to determine\nthe number of objects stored.\n\n>>> len(db)\n0\n\nThe database uses an object graph persistency model. Objects must be\ntransitively connected to the root node of the database (by Python\nreference).\n\nSince this is an empty database, there is no root object yet.\n\n>>> db.root is None\nTrue\n\nPersistent objects\n------------------\n\nAny persistent object can be elected as the database root\nobject. Persistent objects must inherit from the ``Persistent``\nclass. These objects form the basis of the concurrency model;\noverlapping transactions may write a disjoint set of objects (conflict\nresolution mechanisms are available to ease this requirement).\n\n>>> from dobbin.persistent import Persistent\n>>> obj = Persistent()\n\nPersistent objects begin life in *local* state. In this state we can\nboth read and write attributes. However, when we want to write to an\nobject which has previously been persisted in the database, we must\ncheck it out explicitly using the ``checkout`` method. 
We will see how\nthis works shortly.\n\n>>> obj.name = 'John'\n>>> obj.name\n'John'\n\nElecting a database root\n------------------------\n\nWe can elect this object as the root of the database.\n\n>>> db.elect(obj)\n>>> obj._p_jar is db\nTrue\n\nThe object is now the root of the object graph. To persist changes on\ndisk, we commit the transaction.\n\n>>> transaction.commit()\n\nAs expected, the database contains one object.\n\n>>> len(db)\n1\n\nThe ``tx_count`` attribute returns the number of transactions which\nhave been written to the database (successful and failed).\n\n>>> db.tx_count\n1\n\nChecking out objects\n--------------------\n\nThe object is now persisted in the database. This means that we must\nnow check it out before we are allowed to write to it.\n\n>>> obj.name = \"John\"\nTraceback (most recent call last):\n ...\nTypeError: Can't set attribute on shared object.\n\nWe use the ``checkout`` method on the object to change its state to\nlocal.\n\n>>> from dobbin.persistent import checkout\n>>> checkout(obj)\n\n.. warning:: Applications must check out already persisted objects before changing their state.\n\nThe ``checkout`` method does not have a return value; this is because\nthe object identity never actually changes. Instead, custom attribute\naccessor and mutator methods are used to provide a thread-local object\nstate. This happens transparently to the user.\n\nAfter checking out the object, we can both read and write attributes.\n\n>>> obj.name = 'James'\n\nWhen an object is first checked out by some thread, a counter is set\nto keep track of how many threads have checked out the object. 
When it\nfalls to zero (always on a transaction boundary), it's retracted to\nthe previous shared state.\n\n>>> transaction.commit()\n\nThis increases the transaction count by one.\n\n>>> db.tx_count\n2\n\nConcurrency\n-----------\n\nThe object manager (which implements the low-level functionality) is\ninherently thread-safe; it uses the MVCC concurrency model.\n\nIt's up to the database which sits on top of the object manager to\nsupport concurrency between external processes sharing the same\ndatabase (the included database implementation uses a file-locking\nscheme to extend the MVCC concurrency model to external processes; no\nconfiguration is required).\n\nWe can demonstrate concurrency between two separate processes by\nrunning a second database instance in the same thread.\n\n>>> new_db = Database(database_path)\n>>> new_obj = new_db.root\n\nObjects from this database are disjoint from those of the first\ndatabase.\n\n>>> new_obj is obj\nFalse\n\nThe new database instance has already read the previously committed\ntransactions and applied them to its object graph.\n\n>>> new_obj.name\n'James'\n\nLet's examine this further. 
If we check out a persistent object from\nthe first database instance and commit the changes, that same object\nfrom the second database will be updated as soon as we begin a new\ntransaction.\n\n>>> checkout(obj)\n>>> obj.name = 'Jane'\n>>> transaction.commit()\n\nThe database has registered the transaction; the new instance hasn't.\n\n>>> db.tx_count - new_db.tx_count\n1\n\nThe object graphs are not synchronized.\n\n>>> new_obj.name\n'James'\n\nApplications must begin a new transaction to stay in sync.\n\n>>> tx = transaction.begin()\n>>> new_obj.name\n'Jane'\n\nConflicts\n---------\n\nWhen concurrent transactions attempt to modify the same objects, we\nget a write conflict in all but one (first to get the commit-lock wins\nthe transaction).\n\nObjects can provide conflict resolution capabilities such that two\nconcurrent transactions may update the same object.\n\n.. note:: There is no built-in conflict resolution in the persistent base class.\n\nAs an example, let's create a counter object; it could represent a\ncounter which keeps track of visitors on a website. To provide\nconflict resolution for instances of this class, we implement a\n``_p_resolve_conflict`` method.\n\n>>> class Counter(Persistent):\n... def __init__(self):\n... self.count = 0\n...\n... def hit(self):\n... self.count += 1\n...\n... @staticmethod\n... def _p_resolve_conflict(old_state, saved_state, new_state):\n... saved_diff = saved_state['count'] - old_state['count']\n... new_diff = new_state['count']- old_state['count']\n... return {'count': old_state['count'] + saved_diff + new_diff}\n\nAs a doctest technicality, we set the class on the builtins-module\n(there's a difference here between Python 2.x and 3.x series, which\nexplains the fallback import location).\n\n>>> try:\n... import __builtin__ as builtins\n... except ImportError:\n... 
import builtins\n\n>>> builtins.Counter = Counter\n\nNext we instantiate a counter instance, then add it to the object graph.\n\n>>> counter = Counter()\n>>> checkout(obj)\n>>> obj.counter = counter\n>>> transaction.commit()\n\nTo demonstrate the conflict resolution functionality of this class, we\nupdate the counter in two concurrent transactions. We will attempt one\nof the transactions in a separate thread.\n\n>>> from threading import Semaphore\n>>> flag = Semaphore()\n>>> flag.acquire()\nTrue\n\n>>> def run():\n... counter = db.root.counter\n... assert counter is not None\n... checkout(counter)\n... counter.hit()\n... flag.acquire()\n... try: transaction.commit()\n... finally: flag.release()\n\n>>> from threading import Thread\n>>> thread = Thread(target=run)\n>>> thread.start()\n\nIn the main thread we check out the same object and assign a different\nattribute value.\n\n>>> checkout(counter)\n>>> counter.count\n0\n>>> counter.hit()\n\nOnce we release the semaphore, the thread commits its transaction.\n\n>>> flag.release()\n>>> thread.join()\n\nAs we commit the transaction running in the main thread, we expect the\ncounter to have been increased twice.\n\n>>> transaction.commit()\n>>> counter.count\n2\n\nMore objects\n------------\n\nPersistent objects must be connected to the object graph before\nthey're persisted in the database. If we check out a persistent object\nand commit the transaction without adding it to the object graph, an\nexception is raised.\n\n>>> another = Persistent()\n>>> from dobbin.exc import ObjectGraphError\n>>> try:\n... transaction.commit()\n... except ObjectGraphError as exc:\n... print(str(exc))\n not connected to graph.\n\nWe abort the transaction and try again, this time connecting the\nobject using an attribute reference.\n\n>>> transaction.abort()\n>>> checkout(another)\n>>> another.name = 'Karla'\n>>> checkout(obj)\n>>> obj.another = another\n\nWe commit the transaction and observe that the object count has\ngrown. 
The new object has been assigned an oid as well (these are not\nin general predictable; they are assigned by the database on commit).\n\n>>> transaction.commit()\n>>> len(db)\n3\n\n>>> another._p_oid is not None\nTrue\n\nIf we begin a new transaction, the new object will propagate to the\nsecond database instance.\n\n>>> tx = transaction.begin()\n>>> new_obj.another.name\n'Karla'\n\nAs we check out the object that carries the reference and access any\nattribute, a deep-copy of the shared state is made behind the\nscenes. Persistent objects are never copied, however, which a simple\nidentity check will confirm.\n\n>>> checkout(obj)\n>>> obj.another is another\nTrue\n\nCircular references are permitted.\n\n>>> checkout(another)\n>>> another.another = obj\n>>> transaction.commit()\n\nAgain, we can verify the identity.\n\n>>> another.another is obj\nTrue\n\nStoring files\n-------------\n\nWe can persist open files (or any stream object) by enclosing them in\na *persistent file* wrapper. The wrapper is immutable; it's for single\nuse only.\n\n>>> from tempfile import TemporaryFile\n>>> file = TemporaryFile()\n>>> length = file.write(b'abc')\n>>> pos = file.seek(0)\n\nNote that the file is read from the current position and until the end\nof the file.\n\n>>> from dobbin.persistent import PersistentFile\n>>> pfile = PersistentFile(file)\n\nLet's store this persistent file as an attribute on our object.\n\n>>> checkout(obj)\n>>> obj.file = pfile\n>>> transaction.commit()\n\nNote that the persistent file has been given a new class. 
It's the\nsame object (in terms of object identity), but since it's now stored\nin the database and is only available as a file stream, we call it a\n*persistent stream*.\n\n>>> obj.file\n\n\nWe must manually close the file we provided to the persistent wrapper\n(or let it fall out of scope).\n\n>>> file.close()\n>>> pfile.closed\nTrue\n\nUsing persistent streams\n------------------------\n\nThere are two ways to use persistent streams; either by iterating\nthrough it, in which case it automatically gets a file handle\n(implicitly closed when the iterator is garbage-collected), or through\na file-like API.\n\nWe use the ``open`` method to open the stream; this is always\nrequired when using the stream as a file.\n\n>>> obj.file.open()\n>>> print(obj.file.read().decode('ascii'))\nabc\n\nThe ``seek`` and ``tell`` methods work as expected.\n\n>>> int(obj.file.tell())\n3\n\nWe can seek to the beginning and repeat the exercise.\n\n>>> obj.file.seek(0)\n>>> print(obj.file.read().decode('ascii'))\nabc\n\nAs any file, we have to close it after use.\n\n>>> obj.file.close()\n\nIn addition we can use iteration to read the file; in this case, we\nneedn't bother opening or closing the file. This is automatically done\nfor us. Note that this makes persistent streams suitable as return\nvalues for WSGI applications.\n\n>>> print(\"\".join(thunk.decode('ascii') for thunk in obj.file))\nabc\n\nIteration is strictly independent from the other methods. We can\nobserve that the file remains closed.\n\n>>> obj.file.closed\nTrue\n\nStart a new transaction (to prompt database catch-up) and confirm that\nfile is available from second database.\n\n>>> tx = transaction.begin()\n>>> print(\"\".join(thunk.decode('ascii') for thunk in new_obj.file))\nabc\n\nPersistent dictionary\n---------------------\n\nIt's not advisable in general to use the built-in ``dict`` type to\nstore records in the database, in particular not if you expect\nfrequent minor changes. 
Instead the ``PersistentDict`` class should be\nused (directly, or subclassed).\n\nIt operates as a normal Python dictionary and provides the same\nmethods.\n\n>>> from dobbin.persistent import PersistentDict\n>>> pdict = PersistentDict()\n\nCheck out objects and connect to object graph.\n\n>>> checkout(obj)\n>>> obj.pdict = pdict\n\nYou can store any key/value combination that works with standard\ndictionaries.\n\n>>> pdict['obj'] = obj\n>>> pdict['obj'] is obj\nTrue\n\nThe ``PersistentDict`` stores attributes, too. Note that attributes\nand dictionary entries are independent from each other.\n\n>>> pdict.name = 'Bob'\n>>> pdict.name\n'Bob'\n\nCommitting the changes.\n\n>>> transaction.commit()\n>>> pdict['obj'] is obj\nTrue\n>>> pdict.name\n'Bob'\n\nSnapshots\n---------\n\nWe can use the ``snapshot`` method to merge all database transactions\nuntil a given timestamp and write the snapshot as a single transaction\nto a new database.\n\n>>> tmp_path = \"%s.tmp\" % database_path\n>>> tmp_db = Database(tmp_path)\n\nTo include all transactions (i.e. 
the current state), we just pass the\ntarget database.\n\n>>> db.snapshot(tmp_db)\n\nThe snapshot contains four objects.\n\n>>> len(tmp_db)\n4\n\nThey were persisted in a single transaction.\n\n>>> tmp_db.tx_count\n1\n\nWe can confirm that the state indeed matches that of the current\ndatabase.\n\n>>> tmp_obj = tmp_db.root\n\nThe object graph is equal to that of the original database.\n\n>>> tmp_obj.name\n'Jane'\n>>> tmp_obj.another.name\n'Karla'\n>>> tmp_obj.pdict['obj'] is tmp_obj\nTrue\n>>> tmp_obj.pdict.name\n'Bob'\n\nBinary streams are included in the snapshot, too.\n\n>>> print(\"\".join(thunk.decode('ascii') for thunk in tmp_obj.file))\nabc\n\nCleanup\n-------\n\n>>> transaction.commit()\n>>> db.close()\n>>> new_db.close()\n>>> tmp_db.close()\n\nThis concludes the narrative.\n\n\nChanges\n=======\n\n0.3 (2012-02-02)\n----------------\n\n- Add support for Python 3.\n\n- Use C-optimized pickle module when available.\n\n0.2 (2009-10-22)\n----------------\n\n- Subclasses may now override existing methods (e.g. ``__setattr__``)\n and use ``super`` to get at the overridden method.\n\n- Transactions now see data in isolation.\n\n- When a persistent object is first created, its state is immediately\n local. 
This allows an ``__init__`` method to initialize the object.\n\n- Added method to create a snapshot in time of an existing database.\n\n- Added ``PersistentDict`` class.\n\n- The ``Persistent`` class is now persisted as changesets rather than\n complete object state.\n\n- Set up tests to run using the nose testrunner (or using setuptools).\n\n0.1 (2009-09-26)\n----------------\n\n- Initial public release.", "description_content_type": null, "docs_url": null, "download_url": "UNKNOWN", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "UNKNOWN", "keywords": "object database persistence", "license": "BSD", "maintainer": null, "maintainer_email": null, "name": "dobbin", "package_url": "https://pypi.org/project/dobbin/", "platform": "UNKNOWN", "project_url": "https://pypi.org/project/dobbin/", "project_urls": { "Download": "UNKNOWN", "Homepage": "UNKNOWN" }, "release_url": "https://pypi.org/project/dobbin/0.3/", "requires_dist": null, "requires_python": null, "summary": "Transactional object database, implemented in pure Python.", "version": "0.3" }, "last_serial": 791291, "releases": { "0.1": [ { "comment_text": "", "digests": { "md5": "6c85529b28269d9ae01d7216e1f0b7e0", "sha256": "5d1e4c7f2335b5a70aed336ea94d8152f71678edff84dc031fad729d80d9be44" }, "downloads": -1, "filename": "dobbin-0.1.tar.gz", "has_sig": false, "md5_digest": "6c85529b28269d9ae01d7216e1f0b7e0", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 25140, "upload_time": "2009-09-26T14:03:30", "url": "https://files.pythonhosted.org/packages/37/66/a66bf2408d51c509d8340bc785b79f7672d669b5663adbdbfa33da683b15/dobbin-0.1.tar.gz" } ], "0.2": [ { "comment_text": "", "digests": { "md5": "c5f7611f62e61711440ffa9b835f26d0", "sha256": "b61c1357713f6fdc8a1dc148a784a332838b540182ef960f79d93d13ea1d7c35" }, "downloads": -1, "filename": "dobbin-0.2.tar.gz", "has_sig": false, "md5_digest": "c5f7611f62e61711440ffa9b835f26d0", "packagetype": "sdist", 
"python_version": "source", "requires_python": null, "size": 30622, "upload_time": "2009-10-22T11:04:28", "url": "https://files.pythonhosted.org/packages/2e/1a/f6a4ab2f1ef3d2fb79c127b96a703fd2d55143da0259e69a89181cfe40b1/dobbin-0.2.tar.gz" } ], "0.3": [ { "comment_text": "", "digests": { "md5": "737179c70f880b9463a1c996e36f6a50", "sha256": "f41ef2df958fa656e1d390adfac92373dec3385a0f05051c135499ba500a50fc" }, "downloads": -1, "filename": "dobbin-0.3.tar.gz", "has_sig": false, "md5_digest": "737179c70f880b9463a1c996e36f6a50", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 27785, "upload_time": "2012-03-02T14:48:44", "url": "https://files.pythonhosted.org/packages/50/93/4d898410e9dc15f3af40b16f85ce21440ea70723415b3356894bc78b55b4/dobbin-0.3.tar.gz" } ], "0.3-dev": [] }, "urls": [ { "comment_text": "", "digests": { "md5": "737179c70f880b9463a1c996e36f6a50", "sha256": "f41ef2df958fa656e1d390adfac92373dec3385a0f05051c135499ba500a50fc" }, "downloads": -1, "filename": "dobbin-0.3.tar.gz", "has_sig": false, "md5_digest": "737179c70f880b9463a1c996e36f6a50", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 27785, "upload_time": "2012-03-02T14:48:44", "url": "https://files.pythonhosted.org/packages/50/93/4d898410e9dc15f3af40b16f85ce21440ea70723415b3356894bc78b55b4/dobbin-0.3.tar.gz" } ] }