Tuesday, August 31, 2010

Waking up to a New Virtual Reality

My first experience with a virtualized developer environment was back in 2007. I was taking over a project that had been in production for a year or so. It wasn’t a very large project and I was just brought in to fix some bugs and add some features.

But anyone who has been added to a project ‘just’ to add some value knows it can be a tedious affair just to get the development environment up and running. This was a project based on EpiServer (a SharePoint’ish Content Management System) running on ASP.NET 2.0 on the frontend and Sql Server 2005 in the backend.

I would probably have had to spend at least 2 or 3 days just to get my dev machine set up for this project. That is, if I could get my dev environment set up at all.

I had never worked on EpiServer before, and so just setting this up would have been a long and winding road. The version we were running was a couple of versions old and the web contained little to no information on it. Even getting the correct installer would probably have taken a couple of days…

Lucky for me, this project had been using VMware to virtualize the developer, test and staging environments, and so getting my first dev build up was merely a matter of installing the VMware client, copying over the virtual image and starting it up. Almost too easy!

The dev machine in this case was a Windows XP image with Visual Studio 2008 and Sql Server 2005. An environment quite fit for virtualization. My next project was onsite at a customer with an already set up machine, and so I didn’t have the chance to virtualize anything there. But starting on a new project again in late 2008, I decided to give VMware a try again.

This time it was on a Vista box (with BitLocker) and a dev environment that also required Vista. Needless to say, Vista turned out, to my big disappointment, to be useless both as a virtualized guest and as the virtualization host. At least for a developer environment.

Then Windows 7 came around and I saw some blog posts on how you could boot directly off a Virtual Hard Disk (VHD), meaning you would only suffer about 3-5% performance loss due to virtualization. The only hardware that is actually virtual is the hard drive. Everything else (CPU, memory, graphics card, network, USB) is non-virtualized. You’re running directly off the hardware.

I didn’t take the time to test out VHD boot as long as I had my dev environment already set up and everything was running fine. But about a month ago I got a brand new Lenovo W510, and so I finally got the ‘excuse’ I needed to give virtualization a new chance.

And I can tell you; so far it’s been pure gold!

From what I’ve experienced so far, here are the pros and cons of VHD native boot on Windows 7:


- Easy to set up a new dev environment for testing out new tools, frameworks, languages or what-have-you. You just need to keep a copy of your ‘base images’ so that you can start fresh from there.

- Easy to get new members of a team up and running. A little disclaimer here as I haven’t actually tried this, but it should only be a matter of running sysprep with the ‘generalize’ option on the virtual machine.

- Backup is just a file copy operation

- Getting up and running on a new physical machine is just a matter of installing Windows 7, editing the boot manager (bcdedit) and copying the VHD file over to your new machine

- “Avoid” BitLocker. Now this might seem like a strange thing to do, but as a consultant I have two sets of security manuals to conform to: one for my employer and one for the customer that hires me. The security regulations are seldom at the same level, and so I always have to be set up to meet whoever has the highest security bar. Most often that would be my employer.

And as every dev knows, the more layers of security you add to your machine, the longer your compile takes. For some reason I was not equipped with a lot of patience at birth – and I haven’t gotten any since – and so sluggish machines do not suit me well. BitLocker, enterprise anti-virus clients, and other well-intended enterprise security apps can really suck the life out of any machine. If all you need to do your job is Outlook, Word, Excel and a browser, that would be fine. When you need to compile a 65-project-large solution 400 times a day, it isn’t. And so if I’m working for a client who doesn’t require disk encryption and sluggish anti-virus software, then I’m perfectly fine with that.


- A performance hit. I’ve seen 3-5% quoted, but those numbers apparently came out of Scott Hanselman’s butt (his words, not mine) so take that for what it’s worth (I could make a pun and say that ain’t worth sh**, but I’ll refrain from adding that kind of toilet humor to this blog). What I can say, though, is that I can’t tell the difference between running on my virtual and my ‘real’ Windows 7 installation. In a blind test I don’t think I’d be able to tell them apart.

- Hibernation is not supported on the virtualized machine

- Calculation of the Windows Experience Index (WEI) not supported

For me the pros outweigh the cons. The loss in performance is leveled out by not running BitLocker (which also gives you a 3-5% perf hit). Hibernation is nice, but I can live without it and I still have WEI on the host.


My next blog posts will cover how I created my VHDs and got my multi-boot set up.

Sunday, February 14, 2010

Option Explicit On – Commands in CQRS

The idea behind commands in a CQRS architecture is that they should be very explicit and very specific about their intention. You would try to shy away from generic CRUD operations and rather try to capture the essence of what the user is trying to accomplish. Meaning, instead of a ‘one form to save all data’ you would rather let the user explicitly tell what (s)he wants to achieve.

Ok, example. Say you have an application for car dealers. In here you have the possibility to set the price of the car you want to sell. Now, you can either put this as a ‘price field’ among a bunch of other car related data like registration number, brand, model, horsepower, etc, and save it along with the rest of the data. Or you can make sure that changing a car’s price is an operation of its own.

In CQRS you would typically go for the second option. You would make sure that the user’s intent is expressed very explicitly by giving this operation its own command. That is, instead of putting it inside some big SaveCar method, you make it a method of its own. Something like ChangeCarPrice, or even LowerCarPrice and RaiseCarPrice.
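A minimal sketch of the difference, in Python. The field names (`car_id`, `new_price` and friends) and the `describe` helper are made up for illustration; they don't come from any particular CQRS framework.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class SaveCar:
    """CRUD-style command: every field in one bag, the user's intent is hidden."""
    car_id: str
    registration_number: str
    brand: str
    model: str
    horsepower: int
    price: Decimal


@dataclass(frozen=True)
class ChangeCarPrice:
    """Explicit command: the name alone says what the user wants to do."""
    car_id: str
    new_price: Decimal


def describe(command: object) -> str:
    """Return the intent we can read off a command just from its type."""
    return type(command).__name__


# With SaveCar we only learn that "something about the car" was saved.
# With ChangeCarPrice the intent is right there in the name.
print(describe(ChangeCarPrice(car_id="bmw-1", new_price=Decimal("50000"))))
```

The point is not the code itself, but that a handler receiving a ChangeCarPrice never has to diff 30 fields to figure out what the user actually meant.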

Wouldn’t that be an awful lot of commands, you say? Will you be making a command for every change of value in the application? Hell, no. That would be a lot of commands. And that’s why we don’t do that. We’re making a specific command for the ‘car price’ value because the change of this very value is something that has a specific meaning in this domain.

Lowering the price of a car is probably an action you need to take because nobody is willing to pay the price you set earlier on. And it can be one of several marketing actions you can take in order to make the car more saleable. Adding more equipment or freshening up the sales description can be other ‘marketing actions’.

Tracking these specific actions can be very valuable for the car dealer, because having a car taking up space in your warehouse for a longer period of time is bad for profit. Having cars in stock for a minimum amount of time is good for profit. And so tracking which marketing actions are most effective over time can be very lucrative for a car dealer (or any kind of dealer I guess).


Another way of looking at commands is that they should capture the behavior expressed in your domain model. Patterns like Table Module are arguably more focused on data than behavior, which makes them very well suited for systems where complexity is not that high. For complex domains it is the other way around: Domain Model is more focused on behavior and less data centric.

I would argue that your average Customer Relationship Management system (CRM) or Content Management System (CMS) are examples of systems where data is more important, or rather more valuable, than the behavior of the system. As with all things in life there are exceptions, but from my own experience the typical CRM or CMS would make a good fit for a Table Module or Record Set pattern.

Systems built using data centric models are far easier to build and maintain. That is, of course, until you start having too much logic – too much behavior – sprinkled around the code. In that case you’re probably better off using something like the Domain Model pattern.

So let’s focus on the Domain Model again, because in a CQRS architecture there will typically be a Domain Model that contains the essential business logic. The core of the business so to speak.

In a sufficiently complex system there will be a lot of behavior and complex rules attached to those behaviors. Let’s take for instance the aforementioned ChangeCarPrice. Larger car dealers can have hundreds of cars for sale, and all cars will have a designated ‘responsible salesman’. Each salesman can have several cars which they are responsible for and they probably will have some kind of bonus arrangement tied to how many cars they sell.

Imagine a scenario where a potential car buyer walks into the shop. Let’s call our potential customer ‘Johnny’. Johnny has some preferences as to what car he wants, but for the most part he’s pretty open to which exact car he’ll end up buying. He’s looking for a 4x4 station wagon, preferably black or dark gray, with diesel engine and leather seats. Johnny’s got about $50.000 to spend on the car - which by the way is a mid-priced car here in Norway. (Yes, I know. It’s an expensive country and everything costs more than it should and blah, blah, blah. It’s a whole other story.)

The salesman of this story, let’s call him Bob, doesn’t have any cars that fit Johnny’s preferences. At least none that appeal enough to make him leave his $50.000 in the shop. Johnny did however spot a BMW at $55.000 that he really liked, but the extra $5.000 is more than Johnny can afford at the moment. And Bob is not willing to let the BMW go for as little as $50.000, so no business is done.

Four weeks go by and the BMW is still in the shop, but now it’s starting to get costly to have it just standing there, and so the price is lowered to $50.000. Wouldn’t it be nice if Bob’s software were smart enough to notify Johnny about this event?

Yes, it would, but building a system that can handle these kinds of events is actually very tricky. Having an explicit command that triggers when the car price changes makes it a whole lot easier to add a business rule like ‘notify customer if price drops to or below $50.000’, because you know exactly where to put that behavior.
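To make the Johnny-and-Bob story a bit more concrete, here is a rough sketch of where such a rule could live. Everything in it is an assumption for the sake of illustration: the `Car` class, the `interested` dictionary of customer budgets and the `notify` callback are my own inventions, and I've generalized the rule to "notify anyone whose budget the new price now fits".

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ChangeCarPrice:
    car_id: str
    new_price: Decimal


@dataclass
class Car:
    car_id: str
    price: Decimal
    # Hypothetical: customers who showed interest, mapped to their budget.
    interested: Dict[str, Decimal] = field(default_factory=dict)


def handle_change_car_price(
    command: ChangeCarPrice,
    cars: Dict[str, Car],
    notify: Callable[[str, str], None],
) -> None:
    """The one place where price changes happen, and so where the rule lives."""
    car = cars[command.car_id]
    car.price = command.new_price
    for customer, budget in car.interested.items():
        # Business rule: tell the customer when the price is at or below budget.
        if car.price <= budget:
            notify(customer, f"{car.car_id} is now priced at {car.price}")


# Usage: Johnny could afford 50000, the BMW was listed at 55000.
sent: List[str] = []
cars = {"bmw-1": Car("bmw-1", Decimal("55000"), {"Johnny": Decimal("50000")})}
handle_change_car_price(
    ChangeCarPrice("bmw-1", Decimal("50000")),
    cars,
    lambda customer, message: sent.append(customer),
)
print(sent)
```

In a real system the notification would probably be an event published from the domain model rather than a callback, but the idea is the same: the rule hangs off the explicit command.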

If you have a system where business logic has been randomly added from the UI all the way down to the database, this will be a much tougher job to get done.

So what about the CRUD?

I believe you can still have your ‘store these 30 fields to the database’-operations in a domain driven CQRS architecture. You can have your SaveData command. But commands like that, CRUD commands, where you don’t care about anything but persisting the data, will not trigger any behavior in your domain model. They will just persist data into your relational database, file system, blob storage, or whatever medium holds your data.

Then when new requirements arrive and you need to attach behavior to some of the data in that SaveData command, you will just extract those properties out into their own command and you make that new behavior explicit.
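As a sketch of what that extraction could look like: below, the sales description used to be just another field in SaveData, and then gets its own command once we want to track it as a marketing action. The `CarStore` class, the `ChangeSalesDescription` command and the `marketing_actions` log are hypothetical names I made up for this example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class SaveData:
    """Plain CRUD command: persist whatever fields we are given."""
    car_id: str
    fields: Dict[str, Any]


@dataclass(frozen=True)
class ChangeSalesDescription:
    """Extracted command: the description now carries behavior of its own."""
    car_id: str
    new_description: str


class CarStore:
    def __init__(self) -> None:
        self.rows: Dict[str, Dict[str, Any]] = {}
        self.marketing_actions: List[Tuple[str, str]] = []

    def handle(self, command: object) -> None:
        if isinstance(command, SaveData):
            # No behavior triggered, the data is simply stored.
            self.rows.setdefault(command.car_id, {}).update(command.fields)
        elif isinstance(command, ChangeSalesDescription):
            row = self.rows.setdefault(command.car_id, {})
            row["description"] = command.new_description
            # New, explicit behavior: remember this as a marketing action.
            self.marketing_actions.append((command.car_id, "description freshened up"))
        else:
            raise TypeError(f"Unknown command: {type(command).__name__}")


store = CarStore()
store.handle(SaveData("bmw-1", {"brand": "BMW", "description": "Nice car"}))
store.handle(ChangeSalesDescription("bmw-1", "Freshly detailed, one owner"))
print(store.marketing_actions)
```

The SaveData path stays as dumb as it was; only the property that gained meaning in the domain moved out.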

Maybe even all the way from UI and down to the domain model. That way you will capture the user’s intent and you will have means to encapsulate that precious domain knowledge inside your model.

Further Reading

For more background and resources on CQRS you can take a look at my previous post “Growth is optional. Choose wisely.”.

I mentioned the Domain Model, Table Module and Record Set patterns and there’s no better way to learn about these – and other patterns – than to read Martin Fowler’s excellent “Patterns of Enterprise Application Architecture”. A short description of the patterns can be found in the P of EAA Catalog’s "Domain Model", "Table Module" and "Record Set".

I also touched on Domain Driven Design a bit, and again: no better source than the source itself. If you haven’t already – go read Eric Evans’ “Domain Driven Design – Tackling Complexity in the Heart of Software”. Just do it. And you can come back and thank me for the tip afterwards :)

Thursday, February 11, 2010

Growth is optional. Choose Wisely.

Command Query Responsibility Segregation, or CQRS for short, is an architectural pattern based on the idea of Command Query Separation, CQS. It’s a pattern currently advocated by people like Udi Dahan, Greg Young, Mark Nijhof and Pål Fossmo (see below for links and resources).

The background for CQRS is a mathematical theorem called the CAP Theorem put forward by Eric Brewer. It states that;

“You can have at most two of these properties for any shared-data system: Consistency, Availability, and tolerance of network Partitions.”

You can only get two out of three, which basically means that you have to choose between scalability and continuously consistent data. CQRS is an architectural approach that lets you scale out and deliver high availability, but is a bit more relaxed on the consistency. Meaning that Consistency has to step aside for Availability and Scalability.

Wouldn’t inconsistent data be a bad thing and something we would really strive to avoid? Yes, it would – if data were to be permanently inconsistent. But as long as the data eventually becomes consistent, this is no longer such a bad thing.

After all, how long is data in a multiuser application 100% consistent anyway? Think about it: as soon as the data has left the database – or whatever storage you might have – and is heading up to the user’s screen, someone else could have updated or even deleted the records. The data can be inconsistent even before it hits the screen!

Making a clear separation of commands (writes) from queries (reads) in an application gives you the ability to better scale out the parts that turn out to be bottlenecks. In most applications there are far more reads than writes, and so scaling out the read part will for most scenarios give a performance boost.
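Here is a toy sketch of that separation, assuming one write side and several read replicas. The class names `WriteSide` and `ReadReplica` are my own, and the replicas are updated inline only to keep the example short.

```python
from typing import Dict, List


class ReadReplica:
    """Answers queries from its own copy of the data; never accepts commands."""

    def __init__(self) -> None:
        self._view: Dict[str, int] = {}

    def apply_price_changed(self, car_id: str, new_price: int) -> None:
        self._view[car_id] = new_price

    def cars_at_or_below(self, max_price: int) -> List[str]:
        return sorted(car for car, price in self._view.items() if price <= max_price)


class WriteSide:
    """Accepts commands and owns the authoritative state."""

    def __init__(self, replicas: List[ReadReplica]) -> None:
        self._prices: Dict[str, int] = {}
        self._replicas = replicas

    def change_car_price(self, car_id: str, new_price: int) -> None:
        self._prices[car_id] = new_price
        # Push the change out to every read replica.
        for replica in self._replicas:
            replica.apply_price_changed(car_id, new_price)


# One write side, three read replicas: reads can be spread across the replicas.
replicas = [ReadReplica(), ReadReplica(), ReadReplica()]
write_side = WriteSide(replicas)
write_side.change_car_price("bmw-1", 50000)
print(replicas[2].cars_at_or_below(50000))
```

If reads become the bottleneck you add more replicas without touching the write side at all.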

Now, calling it ‘eventual consistency’ might sound like it will take ‘forever’ before data is consistent, but just as you can scale the command and query parts of the system, you can also scale out the transport mechanism between them.

The transport is typically some kind of queue, for instance MSMQ-based, and so the time before data is consistent depends on the speed of the transport. Throw some more power at the queuing machinery, and you get more up-to-date data.
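A small sketch of what eventual consistency looks like in practice, using Python's in-process `queue.Queue` as a stand-in for something like MSMQ. The `CarSystem` class and its method names are assumptions made for this example.

```python
import queue
from typing import Dict, Tuple


class CarSystem:
    def __init__(self) -> None:
        self.write_model: Dict[str, int] = {}
        self.read_model: Dict[str, int] = {}
        # Stand-in for the transport between the command and query sides.
        self.events: "queue.Queue[Tuple[str, int]]" = queue.Queue()

    def change_car_price(self, car_id: str, new_price: int) -> None:
        """Command side: update the write model and publish an event."""
        self.write_model[car_id] = new_price
        self.events.put((car_id, new_price))

    def drain_events(self) -> int:
        """Query side: apply all pending events; returns how many were applied."""
        applied = 0
        while True:
            try:
                car_id, new_price = self.events.get_nowait()
            except queue.Empty:
                return applied
            self.read_model[car_id] = new_price
            applied += 1


system = CarSystem()
system.change_car_price("bmw-1", 50000)
# Right now the read model is stale: the event is still sitting in the queue.
stale = "bmw-1" not in system.read_model
system.drain_events()
# After the queue has been processed, both sides agree again.
consistent = system.read_model == system.write_model
print(stale, consistent)
```

The window between the command and the drain is the 'eventual' part. The faster the queue is processed, the shorter that window gets.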

Further reading

Udi Dahan’s “Clarified CQRS” is a good and thorough intro to CQRS.

More introductory material on CQRS and how it relates to DDD by Pål Fossmo here: Command and Query Responsibility Segregation (CQRS).

Greg Young gives some clarifications on CQS vs CQRS in "Command Query Separation?".

For some more practical samples check out Mark Nijhof’s blog post "CQRS à la Greg Young", where he introduces his demo app on CQRS and Event Sourcing.

Jonathan Oliver has a run-through of CQRS vs Active Record vs Traditional Domain Model in "DDDD: Why I Love CQRS"

If you’re in the mood for some more background material on Brewer’s CAP Theorem, Julian Browne has an excellent article called "Brewer's CAP Theorem – The cool aid Amazon and Ebay have been drinking".

And just when you’re all pumped up and high on CQRS; Read the “CQRS: Crack for architecture addicts” by Gary Shutler. It might get you down on the ground again. I might not agree with him, but he makes some valid points.

And of course, for all DDD-related topics: The Yahoo Group for Domain Driven Design. Lots of good discussion there – including CQRS.