Version 0.2 of Versioned Domain Model (vdm) is finally done. It was 95% complete in February, but it took another three months to iron out the last bugs and polish it up!

Apart from being a concrete implementation of a system for versioning data(bases), and therefore important for efforts to develop data more collaboratively, it is also a crucial piece of infrastructure for several of our current and future projects, such as CKAN, Microfacts/Weaving History, and Public Domain Works.

Versioned Domain Model (v0.2) with Support for SQLAlchemy

Versioned Domain Model (vdm) is a package which allows you to ‘version’
your domain model in the same way that source code version control
systems such as Subversion allow you to version your code. In particular,
vdm versions a complete model and not just individual domain objects
(for more on this distinction see below).

At present the package is provided as an extension to SQLAlchemy and
SQLObject (with an extension to Elixir in progress).

Getting It

Either via easy_install::

$ easy_install vdm

Or from subversion at::

http://knowledgeforge.net/ckan/svn/vdm/trunk

A Full Versioned Domain Model

To permit ‘atomic’ changes involving multiple objects at once as well as
to facilitate domain object traversal it is necessary to introduce an
explicit ‘Revision’ object to represent a single changeset to the domain
model.

One also needs to introduce the concept of ‘State’. This allows us to
make (some) domain objects stateful, in particular those which are to be
versioned (State is necessary to support delete/undelete functionality
as well as to implement versioned many-to-many relationships).
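
As a rough illustration of why State matters, here is a minimal sketch of delete/undelete implemented by flipping a state flag rather than removing the object. The names ``State``, ``StatefulRecord`` and ``active_only`` are hypothetical and are not taken from the vdm API; the point is only that a ‘deleted’ object stays in storage, so its history remains available.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    DELETED = "deleted"


@dataclass
class StatefulRecord:
    """Hypothetical stand-in for a stateful, versioned domain object."""
    name: str
    state: State = State.ACTIVE

    def delete(self) -> None:
        # Mark as deleted instead of removing the record, so history survives.
        self.state = State.DELETED

    def undelete(self) -> None:
        # Restoring the object is just a matter of flipping the flag back.
        self.state = State.ACTIVE


def active_only(records):
    """Return only the records that have not been marked as deleted."""
    return [r for r in records if r.state is State.ACTIVE]


books = [StatefulRecord("warandpeace"), StatefulRecord("annakarenina")]
books[0].delete()
visible = [r.name for r in active_only(books)]
books[0].undelete()
restored = [r.name for r in active_only(books)]
```

The same flag could be attached to the rows of a many-to-many link table, which is one way a versioned relationship can record that a link was removed without losing the fact that it once existed.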

For each original domain object that becomes versioned we end up with two
domain objects:

  • The ‘continuity’: the original domain object.
  • The ‘version/revision’: the versions/revisions of that domain
    object.

Often a user will never need to be concerned (explicitly) with the
version/revision object as they will just interact with the original
domain object, which will, where necessary, ‘proxy’ requests down to the
‘version/revision’.
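
The proxying described above might look something like the following sketch. It is plain Python with no database, and the class and method names (``BookVersion``, ``add_version``) are assumptions made for illustration, not the vdm implementation. The continuity holds the stable identity and forwards lookups of versioned attributes to its most recent version.

```python
class BookVersion:
    """One snapshot of a book's versioned attributes (hypothetical)."""

    def __init__(self, revision_id, title):
        self.revision_id = revision_id
        self.title = title


class Book:
    """The 'continuity': a stable identity plus a list of versions."""

    def __init__(self, name):
        self.name = name      # identity, shared by all versions
        self.versions = []    # ordered oldest to newest

    def add_version(self, revision_id, **attrs):
        self.versions.append(BookVersion(revision_id, **attrs))

    def __getattr__(self, attr):
        # Only called when normal lookup fails, i.e. for versioned attributes.
        versions = self.__dict__.get("versions")
        if not versions:
            raise AttributeError(attr)
        # Proxy the request down to the latest version.
        return getattr(versions[-1], attr)


book = Book("warandpeace")
book.add_version(1, title="War and Peacee")
book.add_version(2, title="War and Peace")
current_title = book.title
```

Here ``book.title`` is never stored on the continuity itself, yet the caller does not need to know that.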

To give a flavour of all of this here is a pseudo-code example::

# NB: Book and Author are domain objects which have been made
# versioned using the vdm library
#
# we need a session of some kind to track which objects have been changed;
# each session then has a single revision
rev1 = Revision(author='me')
session.revision = rev1
# typo!
b1 = Book(name='warandpeace', title='War and Peacee')
b2 = Book(name='annakarenina', title='Anna')
# accidental duplicate of b1 (deleted in the next revision)
b3 = Book(name='warandpeace')
a1 = Author(name='tolstoy')
# this is just shorthand for ending this revision and saving all changes
# this may vary depending on the implementation
rev1.commit()
timestamp1 = rev1.timestamp

# some time later

rev2 = Revision(author='me')
session.revision = rev2
b1 = Book.get(name='warandpeace')
# correct typo
b1.title = 'War and Peace'
# add the author
a1 = Author.get(name='tolstoy')
b1.authors.append(a1)
# duplicate item so delete
b3.delete()
rev2.commit()

# some time even later
rev1 = Revision.get(timestamp=timestamp1)
b1 = Book.get(name='warandpeace') 
b1 = b1.get_as_of(rev1)
assert b1.title == 'War and Peacee'
assert b1.authors == []
# etc
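
For readers who want something they can actually run, here is a small self-contained sketch of the ‘as of’ lookup used at the end of the pseudo-code. It keeps everything in memory and assumes that revisions carry increasing integer ids; ``VersionedValue`` and ``record`` are made-up names, and the real library may store and query history quite differently.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class Revision:
    id: int
    author: str


@dataclass
class VersionedValue:
    """History of one object: parallel lists of revision ids and snapshots."""
    revision_ids: list = field(default_factory=list)
    snapshots: list = field(default_factory=list)

    def record(self, revision, snapshot):
        # Assumes revisions are recorded in increasing id order.
        self.revision_ids.append(revision.id)
        self.snapshots.append(dict(snapshot))

    def get_as_of(self, revision):
        # Find the last snapshot whose revision id is <= the requested one.
        idx = bisect.bisect_right(self.revision_ids, revision.id)
        if idx == 0:
            return None  # the object did not exist yet at that revision
        return self.snapshots[idx - 1]


rev1 = Revision(1, "me")
rev2 = Revision(2, "me")
book = VersionedValue()
book.record(rev1, {"title": "War and Peacee", "authors": []})
book.record(rev2, {"title": "War and Peace", "authors": ["tolstoy"]})

old = book.get_as_of(rev1)
new = book.get_as_of(rev2)
```

Looking the object up as of ``rev1`` returns the title with the typo and no authors, matching the assertions in the pseudo-code above.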

More Information

For more information see the documentation in the package repository at:

http://knowledgeforge.net/ckan/svn/vdm/trunk/


Rufus Pollock is Founder and President of Open Knowledge.