Jul 11th, by Ben Werdmuller

The Elgg data model

As we bring in some beta testers and kick Elgg 1.0's tyres, I thought I'd write an overview of the architecture and some of the decisions we've made. Be warned: this is a very technical post!

Everything we've done has been to work towards the dual goal of making the system more flexible, and making it the easiest social networking platform to work with. This means coding plugins is faster, and that you can do more with them, and that end users also see the benefits through extensive interface improvements and more consistent design.

Entities and data

One of the biggest improvements we've made to Elgg is the data model. Previously, plugins were responsible for introducing their own database tables, which meant that their data was effectively placed in a ring-fenced bucket away from everything else. They could plug into search, but those results were segmented; there was a section for blog posts below a section for users, and so on.

In the new Elgg codebase, everything runs on a unified data model, based on atomic units of data called entities. Plugins are strongly discouraged from dealing with database issues themselves, which makes for a more stable system that also has visible benefits for the end user. Content created by different plugins can be mixed together in consistent ways, which are programmed using generic principles - in other words, plugins are faster to develop, and are at the same time much more powerful.

So what does it look like?

[Figure: Elgg classes]

Every entity in the system inherits from the ElggEntity class. This controls access permissions, ownership and so on; Elgg 1.0 allows you to run multiple sites on the same install, so it also stores the site each entity belongs to.

ElggEntity has four main specializations, which provide extra properties and methods to more easily handle different kinds of data.

  • ElggObject - objects like blog posts, uploaded files and bookmarks
  • ElggUser - each user in the system
  • ElggSite - each site within an Elgg install
  • ElggGroup - multi-user collaborative systems, which were called Communities in prior versions of Elgg

Each of these has its own properties that it brings to the table: ElggObjects have a title and description, ElggUsers have a username and password, and so on. However, because they all inherit from ElggEntity, they each have a number of core properties and behaviours in common.

  • A numeric GUID.
  • Access permissions. (When a plugin requests data, it never gets to touch data that the currently logged-in user doesn't have permission to see.)
  • An arbitrary subtype. For example, a blog post is an ElggObject with a subtype of "blog". Subtypes aren't predefined; they can be any unique way to describe a particular kind of entity.
  • An owner.
  • A site.
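
As a rough illustration of how shared core properties and type-specific ones might fit together, here is a minimal sketch. The class names (SketchEntity, SketchObject, SketchUser), the property names and the two-value access field are all assumptions made for this example; they are not Elgg's actual classes, which are considerably richer.

```php
<?php
// Illustrative sketch only: hypothetical stand-ins, not Elgg's real classes.

abstract class SketchEntity
{
    private static $nextGuid = 1;

    public $guid;       // numeric GUID, unique across all entity types
    public $subtype;    // arbitrary label such as "blog"
    public $ownerGuid;  // GUID of the owning entity
    public $siteGuid;   // GUID of the site the entity belongs to
    public $access;     // simplified here to 'public' or 'private'

    public function __construct($subtype, $ownerGuid, $siteGuid, $access = 'private')
    {
        $this->guid = self::$nextGuid++;
        $this->subtype = $subtype;
        $this->ownerGuid = $ownerGuid;
        $this->siteGuid = $siteGuid;
        $this->access = $access;
    }

    abstract public function getType();
}

class SketchObject extends SketchEntity
{
    public $title = '';
    public $description = '';

    public function getType()
    {
        return 'object';
    }
}

class SketchUser extends SketchEntity
{
    public $username = '';

    public function getType()
    {
        return 'user';
    }
}

// A user, and a blog post owned by that user, both on a made-up site with GUID 1.
$user = new SketchUser('', 0, 1, 'public');
$user->username = 'alice';

$post = new SketchObject('blog', $user->guid, 1, 'public');
$post->title = 'Hello world';

echo $post->getType() . '/' . $post->subtype . ' owned by GUID ' . $post->ownerGuid . "\n";
```

The point of the sketch is that code handling a SketchEntity never needs to know which specialization it has been given: the GUID, subtype, owner, site and access level are always there.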

Each of these also implements an interface called Loggable, which means that actions relating to it are stored in the system log (and potentially displayed in the profile river), and one called Exportable, which means that it can be represented using the Open Data Definition, as well as JSON, serialized PHP and XML.

Everything in the system can also have metadata and annotations attached to it, which in turn have an owner, access permissions etc. Metadata is set-once data like tags, profile items, etc; annotations are things like comments and ratings.

Relationships between entities

One of the most interesting things we've done is to allow generic relationships to be established between any two entities. A very common example is the relationship called "friend", which always occurs between two users - but there's no reason it has to end there. The system allows developers to establish an arbitrary relationship between any two things in the system, following a familiar pattern: subject, predicate, object. These could then be traversed by a plugin to help create accurate recommendations, or perhaps to fine-tune the accuracy of search results. We think that social usage data can help establish relationships behind the scenes, as well as weight them, and we'll be doing more work on that as time goes on.
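
To make the subject, predicate, object pattern concrete, here is a minimal sketch of an in-memory triple store and a naive recommendation built by traversing it. The SketchRelationships class, its method names, the "member" predicate and the GUIDs are all hypothetical, invented for this example; this is not Elgg's relationship API.

```php
<?php
// Illustrative sketch only: a hypothetical in-memory triple store,
// not Elgg's relationship API.

class SketchRelationships
{
    private $triples = [];

    // Record "subject predicate object", e.g. "1 friend 2".
    public function add($subjectGuid, $predicate, $objectGuid)
    {
        $key = "$subjectGuid|$predicate|$objectGuid";
        $this->triples[$key] = [$subjectGuid, $predicate, $objectGuid];
    }

    // Relationships are directional: (1, friend, 2) does not imply (2, friend, 1).
    public function exists($subjectGuid, $predicate, $objectGuid)
    {
        return isset($this->triples["$subjectGuid|$predicate|$objectGuid"]);
    }

    // GUIDs of every object the subject is linked to by the predicate.
    public function objectsOf($subjectGuid, $predicate)
    {
        $found = [];
        foreach ($this->triples as [$s, $p, $o]) {
            if ($s === $subjectGuid && $p === $predicate) {
                $found[] = $o;
            }
        }
        return $found;
    }
}

$rels = new SketchRelationships();
$rels->add(1, 'friend', 2);   // user 1 is a friend of user 2
$rels->add(1, 'friend', 3);   // user 1 is a friend of user 3
$rels->add(2, 'member', 10);  // user 2 is a member of group 10
$rels->add(3, 'member', 10);  // user 3 is a member of group 10

// Naive recommendation: count how many of user 1's friends belong to each group.
$suggested = [];
foreach ($rels->objectsOf(1, 'friend') as $friendGuid) {
    foreach ($rels->objectsOf($friendGuid, 'member') as $groupGuid) {
        $suggested[$groupGuid] = ($suggested[$groupGuid] ?? 0) + 1;
    }
}

print_r($suggested);
```

Because the predicate is just a string, nothing in the store needs to change when a plugin invents a new kind of relationship.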

Views and templating

Of course, none of this matters if you can't get the data to the user in the way you need to. Luckily, Elgg comes with a powerful templating system.

Each entity in the system can be displayed using a view. Each view has a name, which can potentially be split into subdirectories; for example, 'css', 'page_elements/header' and 'object/blog' are all valid view names. In turn, there's the concept of a view type - by default this is standard HTML, but it could be mobile HTML, RSS, etc. Just as you can create your own views, you can create your own view types. Views are stored in subdirectories under their view type - so a standard HTML version of 'object/blog' is stored in 'views/default/object/blog.php'. The RSS version is then contained in 'views/rss/object/blog.php'. Plugins can extend views or replace them with their own data.

Everything in the system knows how to draw itself; if we're trying to view an ElggObject of subtype 'blog', the system will automatically look for a view called 'object/blog'; similarly, 'user/admin' and so on. Elgg is smart enough to take a stab at creating RSS representations of every entity. (Plugin authors can override the default views of their particular subtypes of entity.)
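
The naming convention described above can be sketched in a few lines. The two helper functions here are hypothetical, written only to show how an entity's type and subtype map to a view name, and how a view name and view type map to a file path; they are not Elgg's actual view loader.

```php
<?php
// Illustrative sketch only: hypothetical helpers, not Elgg's view loader.

// An entity's view name is its type followed by its subtype.
function sketch_entity_view_name($type, $subtype)
{
    return $type . '/' . $subtype;
}

// Views live in a subdirectory named after their view type.
function sketch_view_path($viewName, $viewType = 'default')
{
    return 'views/' . $viewType . '/' . $viewName . '.php';
}

$view = sketch_entity_view_name('object', 'blog');

echo sketch_view_path($view) . "\n";         // views/default/object/blog.php
echo sketch_view_path($view, 'rss') . "\n";  // views/rss/object/blog.php
```

Under this scheme, adding a new output format is a matter of adding a new directory of view files; the calling code only changes the view type it asks for.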

To get a representation of an entity, you can call:

$formatted_entity_for_display = elgg_view_entity($entity);

To get a paginated list of blog posts, you can simply call:

$formatted_list = list_entities('object','blog');

And as if by magic, the RSS feed for the same page will be generated for you and linked up in the head section of the page so that the user's web browser knows it's there. View and logic are kept separate in the new Elgg, so creating a mobile web view, RSS, JSON or other version of the same page is extremely simple. With common formats like RSS, we've made that even simpler, by doing most of the legwork for you.

Because all entities are essentially the same, you also don't need to do anything at all to hook your features into search. If you search for 'foo', you'll get a list of everything matching foo, in one unified list. (With RSS, natch.) Because access permissions are integrated at a very deep level, users only ever see the items they've been given access to - so the same search will yield a different result set when you're logged in than when you're browsing anonymously.
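
Here is a minimal sketch of why a unified, permission-aware search returns different results for different viewers. The sketch_search function, the array layout and the two-level 'public'/'private' access model are simplifying assumptions made for this example; Elgg's real access system is more fine-grained and is enforced when data is requested, not in plugin code.

```php
<?php
// Illustrative sketch only: hypothetical data and function,
// with access simplified to 'public' or 'private'.

function sketch_search(array $entities, $query, $viewerGuid)
{
    $results = [];
    foreach ($entities as $entity) {
        // Public entities are visible to everyone; private ones only to their owner.
        $visible = $entity['access'] === 'public'
            || ($viewerGuid !== null && $entity['owner'] === $viewerGuid);

        // The permission check comes first, so hidden entities are never matched.
        if ($visible && stripos($entity['text'], $query) !== false) {
            $results[] = $entity['guid'];
        }
    }
    return $results;
}

// Users and objects sit in the same list, so results are not segmented by type.
$entities = [
    ['guid' => 1, 'type' => 'user',   'owner' => 1, 'access' => 'public',  'text' => 'Foo Fighter'],
    ['guid' => 2, 'type' => 'object', 'owner' => 1, 'access' => 'private', 'text' => 'Draft post about foo'],
    ['guid' => 3, 'type' => 'object', 'owner' => 1, 'access' => 'public',  'text' => 'An unrelated post'],
];

$anonymous = sketch_search($entities, 'foo', null);  // browsing anonymously
$loggedIn  = sketch_search($entities, 'foo', 1);     // logged in as the owner, GUID 1

print_r($anonymous);
print_r($loggedIn);
```

The anonymous viewer sees only the public match, while the owner also sees their own private draft.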