Persistence model in v5

Monday, January 23, 2012 by Morten Christensen

The idea with this post is to give a basic understanding of the persistence model in Umbraco v5, with a primary focus on the objects that I expect you, as an Umbraco developer, will need to be familiar with. This post isn't a full-blown deep dive into persistence in Umbraco v5, but it will help you understand the inner workings of dealing with persistence.
Please note that this is primarily intended for developers who want to familiarize themselves with the low level API in v5. A higher level API is in the works, which should make it simple for all types of developers (frontend, backend etc.) to do tasks like creating, updating and deleting a content item, querying and publishing.

In Umbraco v4 there were a lot of different concepts like a Document, a Node, a CMSNode and Media. When dealing with the API you would use Document for programmatically creating a content item, but after the item was published you would treat it as a Node. So basically you would use different approaches for dealing with data depending on the type of item (e.g. content, media, document type) you were working with in the Umbraco backoffice. In v5 it's all about the TypedEntity object. All the different types of items in the Umbraco backoffice are based on the TypedEntity object, which makes the API much simpler to work with.

Let's dive a little deeper into the TypedEntity object and explore some of the related objects that you need to know about.
Basically the TypedEntity is the object you want to store in your persisted storage and what you request when you want to retrieve data and display it on your site or in the backoffice.
A TypedEntity consists of an EntitySchema, a TypedAttributeCollection, a list of AttributeGroup objects and a RelationProxyCollection.
The EntitySchema is used to describe the data that you want your TypedEntity to store - hence the word schema.
The TypedAttributeCollection is, as the name implies, a collection of TypedAttribute objects. A TypedAttribute contains the actual value(s), which you want your TypedEntity to store.
An AttributeGroup is a group to which certain types of attributes belong. You can think of a 'group of attributes' as being similar to a 'tab with properties'. The groups that are available for a TypedEntity are based on the AttributeDefinitions assigned to its EntitySchema.
The RelationProxyCollection is an enumerable sequence of RelationProxy objects, which I won't go into details about. For now you just need to know that it contains proxy classes for Relations.
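
To make the shape described above a bit more concrete, here is a minimal sketch in C#. All of the Sketch* classes are hypothetical, simplified stand-ins and not the real Umbraco v5 types: relations are left out, and a plain object is used where the real TypedAttribute exposes DynamicValue.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration only; the real v5 classes have many more members.
public class SketchAttributeDefinition
{
    public string Alias { get; set; }
    public string Name { get; set; }
}

public class SketchSchema
{
    public string SchemaType { get; set; }

    public List<SketchAttributeDefinition> Definitions { get; } =
        new List<SketchAttributeDefinition>();
}

public class SketchAttribute
{
    // The definition this value conforms to.
    public SketchAttributeDefinition Definition { get; set; }

    // The stored value (assumption: a plain object instead of DynamicValue).
    public object Value { get; set; }
}

public class SketchEntity
{
    public SketchSchema Schema { get; }

    // Attributes are looked up by the alias of their definition.
    public Dictionary<string, SketchAttribute> Attributes { get; } =
        new Dictionary<string, SketchAttribute>();

    public SketchEntity(SketchSchema schema)
    {
        Schema = schema;

        // Create one empty attribute per definition found in the schema.
        foreach (var definition in schema.Definitions)
        {
            Attributes[definition.Alias] = new SketchAttribute { Definition = definition };
        }
    }
}
```

The point of the sketch is only that the entity carries its schema with it, and that every value is reached through an alias declared in that schema.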

So now we know a little bit about what the TypedEntity looks like. The EntitySchema is also fairly important to learn about as it contains the schema for our TypedEntity, so before putting all of these objects into context I'll outline the EntitySchema object.
An EntitySchema consists of a SchemaType, AttributeTypes, AttributeDefinitions, AttributeGroups, XmlConfiguration and a RelationProxyCollection.
SchemaType is simply a string which represents which type of schema this EntitySchema is. As an example this could be "file", "system", "content", "user" etc.
AttributeTypes is a collection of AttributeType objects, which describe the type of attribute we are dealing with. So this is the schema for our TypedAttribute, which we use to describe how an attribute is stored.
AttributeDefinitions is a collection of AttributeDefinition objects, which define each attribute that is added to a schema. An AttributeDefinition stores an Alias, Name, AttributeType and AttributeGroup.
The AttributeGroups collection on the EntitySchema is get-only and returns an EntityCollection with all the groups defined in the AttributeDefinitions.
XmlConfiguration is an XDocument which is used to store the selected icon / thumbnail and description for a document type.
The RelationProxyCollection is similar to that found on the TypedEntity except here we find proxies for the EntitySchema's relations.
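
The get-only AttributeGroups collection can be pictured as a value computed from the definitions rather than stored separately. The following is a rough sketch of that idea only; GroupSketch, DefinitionSketch and SchemaSketch are hypothetical names, and the real EntitySchema returns an EntityCollection rather than the read-only list assumed here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical simplified types; only the derived, get-only groups collection matters here.
public class GroupSketch
{
    public string Alias { get; set; }
}

public class DefinitionSketch
{
    public string Alias { get; set; }
    public GroupSketch Group { get; set; }
}

public class SchemaSketch
{
    public List<DefinitionSketch> Definitions { get; } = new List<DefinitionSketch>();

    // No setter: the groups are recomputed from the definitions on every read,
    // so they can never drift out of sync with what the schema actually defines.
    public IReadOnlyList<GroupSketch> Groups
    {
        get
        {
            return Definitions
                .Select(d => d.Group)
                .Where(g => g != null)
                .Distinct()
                .ToList();
        }
    }
}
```

Adding a definition that points at a new group makes that group show up in Groups without any extra bookkeeping.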

Let's try to put the above into the context of a File object, which we already have in the source of Umbraco v5. The File object is a strongly typed object based on TypedEntity with its own EntitySchema (FileSchema), which is used to describe the properties on our File object.

This is what the FileSchema object looks like:

namespace Umbraco.Framework.Persistence.Model.IO
{
    public class FileSchema : EntitySchema  {
        public FileSchema()
        {
            SchemaType = FixedSchemaTypes.File;
            AttributeDefinitions.Add(new AttributeDefinition  {
                Alias = "name",
                Name = "Name",
                AttributeType = AttributeTypeRegistry.Current.GetAttributeType(StringAttributeType.AliasValue),
                AttributeGroup = FixedGroupDefinitions.GeneralGroup
            });

            AttributeDefinitions.Add(new AttributeDefinition  {
                Alias = "rootedPath",
                Name = "RootedPath",
                AttributeType = AttributeTypeRegistry.Current.GetAttributeType(StringAttributeType.AliasValue),
                AttributeGroup = FixedGroupDefinitions.GeneralGroup
            });

            AttributeDefinitions.Add(new AttributeDefinition  {
                Alias = "rootRelativePath",
                Name = "Root Relative Path",
                AttributeType = AttributeTypeRegistry.Current.GetAttributeType(StringAttributeType.AliasValue),
                AttributeGroup = FixedGroupDefinitions.GeneralGroup
            });

            AttributeDefinitions.Add(new AttributeDefinition  {
                Alias = "publicUrl",
                Name = "Public URL",
                AttributeType = AttributeTypeRegistry.Current.GetAttributeType(StringAttributeType.AliasValue),
                AttributeGroup = FixedGroupDefinitions.GeneralGroup
            });

            AttributeDefinitions.Add(new AttributeDefinition  {
                Alias = "isContainer",
                Name = "Is Container",
                AttributeType = AttributeTypeRegistry.Current.GetAttributeType(IntegerAttributeType.AliasValue),
                AttributeGroup = FixedGroupDefinitions.GeneralGroup
            });

            AttributeDefinitions.Add(new AttributeDefinition  {
                Alias = "contentBytes",
                Name = "Content Bytes",
                AttributeType = new BytesAttributeType(),
                AttributeGroup = FixedGroupDefinitions.GeneralGroup
            });
        }
    }
}


The schema has six AttributeDefinitions, which correspond to the properties on the File object shown below.
Each definition is placed in the GeneralGroup AttributeGroup and assigned a Name and an Alias; the Alias is the key used to put values in the TypedAttributeCollection of the TypedEntity. Most of the attributes are strings, so the standard StringAttributeType is used for them, while isContainer uses IntegerAttributeType and contentBytes uses BytesAttributeType.
And this is what the File object looks like:

namespace Umbraco.Framework.Persistence.Model.IO
{
    public class File : TypedEntity  {
        public File()
        {
            this.SetupFromSchema<FileSchema>();
            IsContainer = false;
        }

        public File(TypedEntity fromEntity)
        {
            this.SetupFromEntity(fromEntity);

        }

        public File(HiveId id)
            : this()
        {
            Id = id;
        }

        public string Name
        {
            get { return (string)Attributes["name"].DynamicValue; }
            set  {                
                if (!RootedPath.IsNullOrWhiteSpace() && Path.GetFileName(RootedPath) != value)
                {
                    var rootLocation = RootedPath.Substring(0, RootedPath.LastIndexOf(Name));
                    Attributes["rootedPath"].DynamicValue = Path.Combine(rootLocation, value);
                }
                Attributes["name"].DynamicValue = value;
            }
        }

        public bool IsContainer
        {
            get { return (bool)Attributes["isContainer"].DynamicValue; }
            set { Attributes["isContainer"].DynamicValue = value; }
        }

        public string RootedPath
        {
            get { return (string)Attributes["rootedPath"].DynamicValue; }
            set  {
                Attributes["rootedPath"].DynamicValue = value;
                if (Name != Path.GetFileName(value))
                {
                    Attributes["name"].DynamicValue = Path.GetFileName(value);    
                }
            }
        }

        public string RootRelativePath
        {
            get { return (string)Attributes["rootRelativePath"].DynamicValue; }
            set { Attributes["rootRelativePath"].DynamicValue = value; }
        }

        public string PublicUrl
        {
            get { return (string)Attributes["publicUrl"].DynamicValue; }
            set { Attributes["publicUrl"].DynamicValue = value; }
        }

        public byte[] ContentBytes
        {
            get  {
                var content = (byte[])Attributes["contentBytes"].DynamicValue;
                if (content == null)
                {
                    if (LazyContentStream == null)
                        return new byte[0];

                    using (var streamValue = LazyContentStream.Value)
                    {
                        ContentBytes = content = streamValue.ReadAllBytes();
                    }
                }
                return content;
            }
            set  {
                if (IsContainer)
                    throw new InvalidOperationException(
                        string.Format(
                            "Entity '{0}' is a container and hence cannot have content assigned to it. To set content ensure that the IsContainer property is false.",
                            Id));

                Attributes["contentBytes"].DynamicValue = value;
            }
        }

        public Lazy<Stream> LazyContentStream { get; set; }
    }
}

As the code shows we have three options for constructing the File object. All of them use the FileSchema which has been defined for this type of entity.
The interesting part of the File is how we get and set the various properties. Notice that we are using the underlying TypedAttributeCollection (the Attributes property) to get and set the actual values of the properties. You might also want to note that because DynamicValue is dynamic we can add more than one value, which in turn has to correspond with the defined AttributeType.

This is how we save a File object using Unit of Work and Hive when an image is being uploaded in the backoffice:

private HiveId StoreFile(HttpPostedFileBase file)
{
    var hive = BackOfficeRequestContext.Application.Hive.GetWriter<IFileStore>(new Uri("storage://file-uploader"));

    var mediaId = Guid.NewGuid();   

    using (var uow = hive.Create())
    {
        var f = new File  {
            RootedPath = mediaId.ToString("N") + "/" + file.FileName
        };
        var stream = file.InputStream;
        if (stream.CanRead && stream.CanSeek)
        {
            stream.Seek(0, SeekOrigin.Begin);
            using (var mem = new MemoryStream())
            {
                stream.CopyTo(mem);
                f.ContentBytes = mem.ToArray();
            }
        }

        uow.Repositories.AddOrUpdate(f);
                
        uow.Complete();

        return f.Id;
    }
}


I hope you are getting the broader picture here: with a schema defined we can store any object that derives from TypedEntity, because the schema tells Hive how the data should be stored and retrieved.
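
As a rough illustration of why the schema is enough, here is a toy in-memory store that never looks at the concrete class of what it saves and works from the schema alone. MiniSchema, MiniEntity and MiniStore are hypothetical names invented for this sketch; this is not the Hive API, and real providers do considerably more.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy types; not the Hive API.
public class MiniSchema
{
    public string SchemaType { get; set; }

    // The aliases this schema defines.
    public List<string> Aliases { get; } = new List<string>();
}

public class MiniEntity
{
    public Guid Id { get; set; }
    public MiniSchema Schema { get; set; }
    public Dictionary<string, object> Values { get; } = new Dictionary<string, object>();
}

public class MiniStore
{
    private readonly Dictionary<Guid, Dictionary<string, object>> _rows =
        new Dictionary<Guid, Dictionary<string, object>>();

    // Persists exactly the aliases the schema defines; anything else is ignored.
    public void AddOrUpdate(MiniEntity entity)
    {
        var row = new Dictionary<string, object>();
        foreach (var alias in entity.Schema.Aliases)
        {
            object value;
            row[alias] = entity.Values.TryGetValue(alias, out value) ? value : null;
        }
        _rows[entity.Id] = row;
    }

    // Rebuilds an entity by walking the same schema that was used when saving.
    public MiniEntity Get(Guid id, MiniSchema schema)
    {
        var entity = new MiniEntity { Id = id, Schema = schema };
        foreach (var alias in schema.Aliases)
        {
            entity.Values[alias] = _rows[id][alias];
        }
        return entity;
    }
}
```

Because the store is driven only by the aliases in the schema, the same two methods would work for a file, a content item or any other entity, as long as it brings a schema along.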

Summary description of the concepts/classes involved:
TypedEntity - Contains a key/value collection of attributes, i.e. a collection of TypedAttribute objects. The TypedEntity also contains the schema that defines it.
TypedAttribute - Contains the actual value(s) and conforms to an AttributeDefinition.
EntitySchema - Definition of a schema used by a TypedEntity.
AttributeDefinition - Definition of the attribute, i.e. Alias, Name, AttributeType and AttributeGroup.
AttributeGroup - The group which the attribute belongs to.
AttributeType - The type of the attribute, e.g. string, integer, bool, text.
- A side note: An AttributeType has a SerializationType, which implements IAttributeSerializationDefinition. This tells Hive the format the data is stored in (String, LongString, SmallInt, LargeInt, Decimal, Date, Guid, Boolean, ByteArray).
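
The side note about serialization types can be pictured roughly like this. The enum members mirror the formats listed above, but everything else is an assumption made for illustration: StorageFormat, AttributeTypeSketch and StorageFormats are hypothetical names, and the CLR type chosen for each format is a guess rather than what Hive actually does.

```csharp
using System;

// Member names mirror the formats listed in the side note; the enum itself is hypothetical.
public enum StorageFormat
{
    String, LongString, SmallInt, LargeInt, Decimal, Date, Guid, Boolean, ByteArray
}

// Hypothetical pairing of an attribute type alias with the format its values are stored in.
public class AttributeTypeSketch
{
    public string Alias { get; }
    public StorageFormat Format { get; }

    public AttributeTypeSketch(string alias, StorageFormat format)
    {
        Alias = alias;
        Format = format;
    }
}

public static class StorageFormats
{
    // Assumed mapping from storage format to a CLR type, for illustration only.
    public static Type ClrTypeFor(StorageFormat format)
    {
        switch (format)
        {
            case StorageFormat.String:
            case StorageFormat.LongString:
                return typeof(string);
            case StorageFormat.SmallInt:
                return typeof(int);
            case StorageFormat.LargeInt:
                return typeof(long);
            case StorageFormat.Decimal:
                return typeof(decimal);
            case StorageFormat.Date:
                return typeof(DateTime);
            case StorageFormat.Guid:
                return typeof(Guid);
            case StorageFormat.Boolean:
                return typeof(bool);
            case StorageFormat.ByteArray:
                return typeof(byte[]);
            default:
                throw new ArgumentOutOfRangeException("format");
        }
    }
}
```

A storage provider could use a lookup like this to decide which column or field type to write a value into, which is the role the side note gives to the SerializationType.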

There are a lot of predefined AttributeType(s) in Umbraco v5, which you can re-use. They can be found in the Umbraco.Framework.Persistence project under the following folder \Model\Constants\AttributeTypes.

If you want to see more examples of defining your own AttributeType, AttributeDefinition and AttributeGroup I recommend that you take a look at the source code of the WordPress Hive provider, which Alex created for a presentation in Sweden last year: https://bitbucket.org/boxbinary/hive-wordpress-provider/src/bb3055fd06ad/Umbraco.Hive.Providers.Wordpress/Schema/Model

I will do a follow-up post about repositories in v5, which should give some insight into the various repositories we work with when we want to persist a TypedEntity, EntitySchema etc. in Umbraco v5.

9 comment(s) for “Persistence model in v5”

  1. Richard Soeteman Says:

    Hi Morten,

    Great article! question, you write "A higher level API is in the works". Any idea when this API will be available?

    Keep up the good work #h5yr,

    Richard

  2. Tom van Enckevort Says:

    Very informative article!

    Will there be an easy way to generate TypedEntity objects from ORM entities (like NHibernate or EF)? Or will you have to define those all by hand?
    It would be nice if it automatically could generate the attributes and types based on the ORM entity properties.

  3. Michiel Says:

    "All the different types of items in the Umbraco backoffice are based on the TypedEntity object, which makes the API much simpler to work with."

    "A TypedEntity consists of an EntitySchema, a TypedAttributeCollection, a list of AttributeGroup objects and a RelationProxyCollection."

    "An EntitySchema consists of a SchemaType, AttributeTypes, AttributeDefinitions, AttributeGroups, XmlConfiguration and a RelationProxyCollection."

    That doesn't necessarily sound much simpler :-) Will you be able to hide some of the details behind a simple API so we don't have to know about such internals (when we just want to get things done)?

  4. Morten Christensen Says:

    @Richard It's difficult to say at this point in time, but expect it after the first release. Remember that v5 is kind of a 1.0 release for this new version.

    @Tom Do you mean generating entities from a database other than Umbraco's?

    @Michiel Yes, there will be a much simpler API available - it's what I meant by the higher level API. We are very conscious of developers who just want a simple API to get the job done without needing to know everything about the internal workings of Hive etc. So expect to see more posts in future about how to work with Razor in v5.

  5. Aaron Powell Says:

    @Michiel - You should only need to know this stuff if you're creating your own Hive interaction layer, CRUD operations against your own data. Generally speaking you shouldn't need to do it.

    That said it really does come across as complex and having implemented some TypedEntity classes myself it is very complex. The naming is also very confusing, in one case Entity is a suffix and in the other it's a prefix and the word Attribute appears so frequently it's just crazy (there's an AttributeAliasAttribute in the source :P).

    But once you get your head around it all it really is powerful. I'd love to see a more in-depth look at how to work with the Attributes on a TypedEntity, I didn't know about the DynamicValue property and I'd love to see how best to store a dictionary into Hive.

  6. Ruben Says:

    I'm currently working on implementing a hive provider and a tree for our own custom data. I cannot help but feel that the entire hive and tree apis have been over-engineered.

    Is there any reason why you could not use the .NET RTTI along with property and class attributes instead of rolling your own parallel RTTI?

  7. Michiel Says:

    @Aaron that's nothing, I once created a CustomAttributeBuilderBuilder

  8. Aaron Powell Says:

    @Michiel - hah I'm not sure if that means you win or lose :P

  9. Patriek van Dorp Says:

    Are there any resources on how to deploy Umbraco 5 onto Windows Azure? I haven't been able to find anything other than a blogpost that describes a way to save media in Blob Storage.
