Flex tech articles: DataCollection

Okay, so I’ve decided to start my series of articles with the data collection.

If you’ve worked with LCDS, then you’re familiar with the concept: the client fetches an ArrayCollection from the server, the user interacts with it and clicks submit, and the client sends the updated data back to the server. Not the complete collection, only the updates.
When you’re dealing with big collections of massive objects, you don’t want to send the whole collection up the wire, only the changes.
The reasons are pretty obvious:

  • Network performance. AMF3 is efficient, but if you’ve got a few thousand users sending up data that is 95% noise, well, you’ve got room for improvement here. Let’s not even mention XML transfer…
  • Server CPU resources. Even though BlazeDS/LCDS does a really great job at serializing/marshalling data, you still pay some overhead for marshalling your objects. Instantiating objects is never cheap on CPU; it eats memory and gives the GC more work. Just like the network issue: on a loaded service, you’ve got a lot of room for improvement here.
  • Server code maintainability. If it gets the raw collection, the server has to figure out by itself what has been updated. This translates into a bunch of database selects to get the original objects back from the data layer (hopefully from the ORM cache, though), turning those SQL results into objects (more allocations) and eventually a big chunk of boilerplate code comparing the existing objects with the new ones. Only then can you get to your business logic.

All this work, for what? Finding that you have to delete one object? Boy, that’s a lot of useless work here…
Wouldn’t you like to only send up the updates?

You could try to do this manually. Not sure you want to open that can of worms.
You’d have to let the UI manage its own collections of deletes/updates/adds, group them all together and send them back to the server. Depending on the UI that can be hard to do, and the code is not necessarily reusable. But knowing Flex’s propertyChange and collectionChange events, you can automate everything.

What’s the idea here?
Pretty simple. ArrayCollection is a clever class that dispatches events whenever it is updated (add, remove, update, sort etc.). The idea is simply to listen to those events and automatically build up a list of changes.
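To make that concrete, here is a minimal sketch of listening to a plain ArrayCollection. The class name CollectionEventDemo and its run() method are made up for the illustration; the collection and event classes are the regular Flex ones.

```actionscript
package
{
  import mx.collections.ArrayCollection;
  import mx.events.CollectionEvent;

  // Hypothetical demo class, only here to show which events fire.
  public class CollectionEventDemo
  {
    public function run():void
    {
      var items:ArrayCollection = new ArrayCollection(["a", "b"]);
      items.addEventListener(CollectionEvent.COLLECTION_CHANGE, onChange);
      items.addItem("c");     // dispatches kind == CollectionEventKind.ADD
      items.removeItemAt(0);  // dispatches kind == CollectionEventKind.REMOVE
    }

    private function onChange(event:CollectionEvent):void
    {
      // event.items holds the affected items
      // (or PropertyChangeEvent instances when the kind is UPDATE)
      trace(event.kind, event.items);
    }
  }
}
```

Every mutation comes through a single listener, which is what makes the automatic tracking possible.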

Let’s start with the base: we’re going to need a class to store each individual change. Let’s call it ChangeObject.

This class is merely a placeholder: the old and new versions of the changed object, its status (added, deleted, updated) and a list of the properties that were updated.
That’s pretty simple, it would look like this:

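import flash.events.EventDispatcher;
import mx.collections.ArrayCollection;

public class ChangeObject extends EventDispatcher
{
  // status values double as the sort key: deletes first, then updates, then adds
  public static const DELETED:int = 0;
  public static const UPDATED:int = 1;
  public static const ADDED:int = 2;

  public var status:int;
  public var oldVersion:Object;
  public var newVersion:Object;
  // names of the properties that were updated
  public var changedPropertyNames:ArrayCollection = new ArrayCollection();
  // latest value of each updated property, keyed by property name
  public var changedProperties:Object = {};

  public function ChangeObject()
  {
  }

  // The helpers test status rather than oldVersion/newVersion:
  // an update only ever sets newVersion, so a null check on oldVersion
  // would report updates as adds.
  public function isAdded():Boolean
  {
    return status == ADDED;
  }

  public function isRemoved():Boolean
  {
    return status == DELETED;
  }

  public function isUpdated():Boolean
  {
    return status == UPDATED;
  }
}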

All right. That was the simple part. Now we want a collection that can create those. Let’s just create a subclass of ArrayCollection and make it encapsulate all the cleverness we need. Since DataCollection is a subclass of ArrayCollection, we can treat it just like a regular collection.
The DataCollection should listen to itself for collectionChange events. Those events are pretty damn useful: they tell you the nature of the change, the objects that were affected, and the property that was updated (if relevant). Check out the CollectionEvent asdocs for more details.
As soon as we trap those events, we’re almost done: we know what happened and to whom. We’re just left with some ChangeObject creation.
Here is what we have so far in our DataCollection:

import flash.utils.IExternalizable;
import mx.collections.ArrayCollection;
import mx.collections.Sort;
import mx.collections.SortField;

public class DataCollection extends ArrayCollection implements IExternalizable
{
  private var _changes:ArrayCollection;

  public function get changes():ArrayCollection
  {
    return _changes;
  }

  public function set changes(value:ArrayCollection):void
  {
    _changes = value;
    if (_changes != null)
    {
      // numeric, ascending sort on status: DELETED (0), UPDATED (1), ADDED (2)
      _changes.sort = new Sort();
      _changes.sort.fields = [new SortField("status", false, false, true)];
    }
  }

  public function DataCollection(source:Array=null)
  {
    super(source);
    changes = new ArrayCollection();
  }
}

A simple extension of ArrayCollection, with an extra attribute holding the changes.
Notice the sort: it orders the changes with deletes first, then updates, then adds. Our server will like this order, because it’s the one that keeps us out of database constraint violations. For example, say the user removed an object, then created another one that is the same according to business logic. If you have a unique constraint on that column in your table, you must delete the old one first, then create the new one. The same applies to updates and adds.
Also, notice the IExternalizable interface. ArrayCollection already implements it, so the declaration is only there to draw attention to it: we are going to override its two methods, and we’ll see why later.

Now that we have basic infrastructure, how do we create those objects?
Easy!
Whenever a collection change event comes in, check the kind (event.kind, an “enum”).
For adds:

  • Search the changes for a ChangeObject that has the current object as a “delete”: it means the user removed the object from the collection, then put it back in. That results in no logical change on the collection, so we remove that “delete” change object from the list of changes.
  • If we can’t find such a ChangeObject, the object is really new: we just have to instantiate the ChangeObject, set the relevant fields and add it to the changes collection.

For deletes:

  • Just like for adds, we first look for an existing “add” change object. If we find one, the object has been added, then deleted: no logical change. We just remove the change object from the changes list.
  • Otherwise, the object has really been deleted. Create the change object etc.

For updates:

  • Create our change object if it doesn’t already exist and mark it as “updated”. If the ChangeObject already existed, don’t change its status to “updated”. If it was added, the change is still a logical “add” to the collection: the server doesn’t know about this object yet, and could not care less that the user created it, then updated it. We just want to treat it as a new object.

And that’s it! The code looks like this:

// needs imports: mx.events.CollectionEvent, mx.events.CollectionEventKind, mx.events.PropertyChangeEvent
private function onCollectionChange(event:CollectionEvent):void
{
  switch (event.kind)
  {
    case CollectionEventKind.ADD:
      for each (var addedObj:Object in event.items)
      {
        // look for an existing change object
        var addCO:ChangeObject = getChangeObject(addedObj);
        if (addCO == null)
        {
          // not found, hence create it and add it to the changes list
          addCO = new ChangeObject();
          addCO.oldVersion = null;
          changes.addItem(addCO);
        }
        else if (addCO.status == ChangeObject.DELETED)
        {
          // object was previously removed and is now added again, resulting in no change: drop the change object
          removeChangeObject(addCO);
        }
        addCO.status = ChangeObject.ADDED;
        addCO.newVersion = addedObj;
      }
      break;
    case CollectionEventKind.REMOVE:
      for each (var removedObj:Object in event.items)
      {
        var removeCO:ChangeObject = getChangeObject(removedObj);
        if (removeCO == null)
        {
          removeCO = new ChangeObject();
          changes.addItem(removeCO);
        }
        else if (removeCO.status == ChangeObject.ADDED)
        {
          // object was recently added, so removing it results in no change: drop the change object
          removeChangeObject(removeCO);
        }
        removeCO.oldVersion = removedObj;
        removeCO.status = ChangeObject.DELETED;
        removeCO.newVersion = null;
      }
      break;
    case CollectionEventKind.UPDATE:
      // for updates, event.items holds PropertyChangeEvent instances
      for each (var propertyChangeEvent:PropertyChangeEvent in event.items)
      {
        var updateCO:ChangeObject = getChangeObject(propertyChangeEvent.source);
        if (updateCO == null)
        {
          updateCO = new ChangeObject();
          updateCO.status = ChangeObject.UPDATED;
          changes.addItem(updateCO);
        }
        // keep track of changed properties
        if (!updateCO.changedPropertyNames.contains(propertyChangeEvent.property))
        {
          updateCO.changedPropertyNames.addItem(propertyChangeEvent.property);
        }
        // changedProperties has to be declared on ChangeObject (a plain Object used as a map)
        updateCO.changedProperties[propertyChangeEvent.property] = propertyChangeEvent.newValue;
        updateCO.newVersion = propertyChangeEvent.source;
      }
      break;
  }
  // re-sort the changes, then notify anyone bound to the "changes" property
  changes.refresh();
  dispatchEvent(PropertyChangeEvent.createUpdateEvent(this, "changes", changes, changes));
}

Notice the refresh and the event dispatching. The refresh keeps the changes collection sorted. The event is there because we might want to bind to changes, so listeners need to be notified that the data has changed.
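The handler relies on two helpers, getChangeObject and removeChangeObject, that aren’t listed in this article. Here is one way they could look, written as static functions on a made-up ChangeLookup class so the sketch stands on its own; inside DataCollection they would simply be private methods working on changes. Matching on oldVersion/newVersion by identity is my assumption about what “the change object for this item” means.

```actionscript
package
{
  import mx.collections.ArrayCollection;

  // Hypothetical helper class; the names ChangeLookup, find and remove
  // are illustrative and not part of Flex.
  public class ChangeLookup
  {
    // Returns the ChangeObject tracking the given item, or null if there is none.
    public static function find(changes:ArrayCollection, item:Object):ChangeObject
    {
      for (var i:int = 0; i < changes.length; i++)
      {
        var co:ChangeObject = ChangeObject(changes.getItemAt(i));
        // identity comparison: we want this very instance, not an equal-looking one
        if (co.newVersion === item || co.oldVersion === item)
        {
          return co;
        }
      }
      return null;
    }

    // Removes the given ChangeObject from the changes list, if present.
    public static function remove(changes:ArrayCollection, co:ChangeObject):void
    {
      for (var i:int = 0; i < changes.length; i++)
      {
        if (changes.getItemAt(i) === co)
        {
          changes.removeItemAt(i);
          return;
        }
      }
    }
  }
}
```

A linear scan is used on purpose: the changes collection is sorted on status, and a status can be modified between two refreshes, so a plain walk avoids depending on the sort being up to date.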

Any call to addItem or removeItem will be caught by the collection. Any change to a bindable object will dispatch a propertyChange event, which the collection catches and translates into a collectionChange event.
We can even modify an object nested a few levels deep inside one of the collection’s objects: as long as the changes bubble, the data collection will catch them.
Sadly, change events don’t bubble by default, so you’ll have to implement the bubbling manually. It’s not complex, actually: set up event listeners and dispatch a new PropertyChangeEvent. It’s just tedious and boilerplate-y. I’ll leave that decision up to you.
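For the record, manual bubbling could look roughly like this. Customer and Address are invented for the example, and Address is assumed to be a [Bindable] class living in its own file, so that it dispatches propertyChange events. The dotted property name is just one possible convention.

```actionscript
package
{
  import flash.events.EventDispatcher;
  import mx.events.PropertyChangeEvent;

  // Hypothetical collection item that re-dispatches the changes of a nested object.
  public class Customer extends EventDispatcher
  {
    private var _address:Address;

    [Bindable(event="propertyChange")]
    public function get address():Address
    {
      return _address;
    }

    public function set address(value:Address):void
    {
      if (_address == value)
      {
        return;
      }
      var old:Address = _address;
      if (old != null)
      {
        // stop listening to the child we are letting go
        old.removeEventListener(PropertyChangeEvent.PROPERTY_CHANGE, onAddressChange);
      }
      _address = value;
      if (_address != null)
      {
        _address.addEventListener(PropertyChangeEvent.PROPERTY_CHANGE, onAddressChange);
      }
      dispatchEvent(PropertyChangeEvent.createUpdateEvent(this, "address", old, value));
    }

    private function onAddressChange(event:PropertyChangeEvent):void
    {
      // Re-dispatch with this object as the source, so the collection holding
      // the Customer sees an update on the Customer itself.
      dispatchEvent(PropertyChangeEvent.createUpdateEvent(
        this, "address." + event.property, event.oldValue, event.newValue));
    }
  }
}
```

With this in place, changing a property on the nested Address ends up as an UPDATE on the Customer in the DataCollection, with a property name like “address.city”.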

Okay, so from a high level, the thing is wrapped up. Now we have to synchronize all this.
In the constructor, hook up the event listener by adding this line:

addEventListener(CollectionEvent.COLLECTION_CHANGE, onCollectionChange);

Let’s not forget sending data up the wire. Remember one of the initial issues: network performance. Let’s just not send everything up the wire. Only the changes.
That’s pretty trivial:

override public function writeExternal(output:IDataOutput):void
{
    output.writeObject(changes);
}

See? Only sending changes :)

We’ll need a readExternal also, since AMF is not doing the marshalling automagically:

override public function readExternal(input:IDataInput):void
{ 
  var object:Object = input.readObject();
  this.source = object as Array;
}

As usual, make your Java-side class symmetrical to the Flex version: that should be pretty trivial and I’ll let you implement it.

One important thing: our readExternal and writeExternal methods are not symmetrical anymore. That is, we’re not reading and writing the same thing. If the network turns out not to be an issue, or just for the sake of decent and maintainable code, we can make them symmetrical and send everything up and down the wire.
The server would still know what changes were made.
If the network is an issue and maintainable code is a requirement (I hope it is…), it’s easily fixable: for example, the controller can send up only the changes attribute instead of the whole DataCollection.

I’ll leave it up to you to add convenience getters on changes, per type or per instance, enabling/disabling change tracking (just unhook the event listener) and that kind of stuff.
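If you want a starting point, here is a rough sketch of both ideas as a subclass. It assumes that onCollectionChange is declared protected in DataCollection (the article doesn’t say) and that the constructor has already hooked it up; ToggleableDataCollection, trackingEnabled and getChangesByStatus are names I made up.

```actionscript
package
{
  import mx.events.CollectionEvent;

  // Hypothetical subclass adding a tracking switch and a per-status getter.
  public class ToggleableDataCollection extends DataCollection
  {
    // DataCollection's constructor registers the listener, so tracking starts on
    private var _trackingEnabled:Boolean = true;

    public function ToggleableDataCollection(source:Array = null)
    {
      super(source);
    }

    public function get trackingEnabled():Boolean
    {
      return _trackingEnabled;
    }

    public function set trackingEnabled(value:Boolean):void
    {
      if (value == _trackingEnabled)
      {
        return;
      }
      _trackingEnabled = value;
      if (value)
      {
        addEventListener(CollectionEvent.COLLECTION_CHANGE, onCollectionChange);
      }
      else
      {
        // just unhook the listener: changes made meanwhile are not recorded
        removeEventListener(CollectionEvent.COLLECTION_CHANGE, onCollectionChange);
      }
    }

    // Returns the change objects with the given status, e.g. ChangeObject.ADDED.
    public function getChangesByStatus(status:int):Array
    {
      var result:Array = [];
      for each (var co:ChangeObject in changes)
      {
        if (co.status == status)
        {
          result.push(co);
        }
      }
      return result;
    }
  }
}
```

Turning tracking off is handy while you populate the collection programmatically, as long as you remember to turn it back on.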

One last thing: ever looked at the removeAll implementation? You’d think we’d get one “remove” collectionChange event per removed item? Nope! We get a single reset event, which is described as “Indicates that the collection has changed so drastically that a reset is required.” It won’t tell us what happened. A workaround is fairly easy: override removeAll and make it call removeItemAt(0) until the collection is empty. Or we could go down the complex path and try to make sense of the reset event. I didn’t do that and favored the “removeItemAt(0)” approach, but feel free to let me know if you’ve done it the “right” way.
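The “removeItemAt(0)” approach boils down to a few lines. To keep the sketch self-contained it is shown on a bare ArrayCollection subclass with a made-up name; in our case the same override would go straight into DataCollection.

```actionscript
package
{
  import mx.collections.ArrayCollection;

  // Hypothetical class name; only the removeAll override matters here.
  public class PerItemRemoveAllCollection extends ArrayCollection
  {
    public function PerItemRemoveAllCollection(source:Array = null)
    {
      super(source);
    }

    override public function removeAll():void
    {
      // One removeItemAt call per item, so that each removal goes through
      // the regular "remove" notification instead of a single reset.
      while (length > 0)
      {
        removeItemAt(0);
      }
    }
  }
}
```

It is slower than the stock removeAll on big collections, since every item triggers its own event, but the change tracking keeps working without any special case.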

That’s it for today. See you soon for the next article.
