Sunday, February 14, 2010

Option Explicit On – Commands in CQRS

The idea behind commands in a CQRS architecture is that they should be very explicit and very specific about their intention. You would try to shed away from generic CRUD operations and rather try to capture the essence of what the user is trying to accomplish. Meaning; instead of a ‘one form to save all data’ you would rather let the user explicitly tell what (s)he wants to achieve.

Ok, example. Say you have an application for car dealers. In here you have the possibility to set the price of the car you want to sell. Now, you can either put this as a ‘price field’ among a bunch of other car related data like registration number, brand, model, horsepower, etc, and save it along the rest of the data. Or you can make sure that the changing of a car’s price is a operation of it’s own.

In CQRS you would typically go for the second option. You would make sure that the user’s intent is expressed very explicit by giving this operation it’s own command. That is; instead of putting it inside some big SaveCar method, you make it a method of its own. Something like ChangeCarPrice, or even LowerCarPrice and RaiseCarPrice.

Wouldn’t that be an awful lot of commands, you say? Will you be making a command for every change of value in the application? Hell, no. That would be a lot of commands. And that’s why we don’t do that. We’re making a specific command for ‘car price’ value because the change of this very value is something that has a specific meaning in this domain.

Lowering the price of car is probably an action you need to do because nobody is willing to pay the price you set earlier on. And it can be one of several other marketing actions you can take in order to make the car more saleable. Adding more equipment or freshening up the sales description can be other ‘marketing actions’.

Tracking these specific actions can be very valuable for the car dealer, because having a car taking up space in your warehouse for a longer period of time is bad for profit. Having cars in stock for a minimum amount of time is good for profit. And so tracking which marketing actions that are most effective over time can be very lucrative for a car dealer (or any kind of dealer I guess).

Behave!

Another way of looking at commands is that they should capture the behavior expressed in your domain model. Patterns like Table Module are arguably more focused on data than behavior, which makes them very well suited for systems where complexity is not that high. And contrary for complex domains; Domain Model is more focused on behavior and less data centric.

I would argue that your average Customer Relationship Management system (CRM) or Content Management System (CMS) are examples of systems were data is more important, or rather more valuable, than the behavior of the system. As to all things in life there’s exceptions, but from my own experience the typical CRM and CMS system would make a good fit for a Table Module or Record Set pattern.

Systems built using data centric models are far easier to build and maintain. That is off course until you start having too much logic – too much behavior – sprinkled around the code. In that case you’re probably better off using using something like the Domain Model pattern.

So let’s focus on the Domain Model again, because in a CQRS architecture there will typically be a Domain Model that contains the essential business logic. The core of the business so to speak.

In a sufficiently complex system there will be a lot of behavior and complex rules attached to those behaviors. Let’s take for instance the aforementioned ChangeCarPrice. Larger car dealers can have hundreds of cars for sale, and all cars will have a designated ‘responsible salesman’. Each salesman can have several cars which they are responsible for and they probably will have some kind of bonus arrangement tied to how many cars they sell.

Imagine a scenario where a potential car buyer walks into the shop. Let’s call our potential customer ‘Johnny’.  Johnny has some preferences to what car he want, but for the most part he’s pretty open to which exact car he’ll end up buying. He’s looking for a 4x4 station wagon, preferably black or dark gray, with diesel engine and leather seats. Johnny’s got about $50.000 to spend on the car - which by the way is a mid-priced car here in Norway. (Yes, I know. It’s an expensive country and everything cost more than it should and blah, blah, blah. It’s a whole other story.)

The salesman of this story, let’s call him Bob, doesn’t have any cars that fits within Johnny’s preferences. At least non that appeals enough to make him leave his $50.000 in the shop. Johnny did however spot a BMW at $55.000 that he really liked, but the extra $5.000 is more than Johnny can afford at the moment. And Bob is not willing to let the BMW go for as little as $50.000, so no business is done.

4 weeks go by and the BMW is still in the shop, but now it’s starting to be costly to having it just standing there, and so the price is lower to $50.000. Wouldn’t it be nice if the Bob’s software were smart enough to notify Johnny about this event?

Yes, it would, but building a system that can handle these kind of events is actually very tricky. Having an explicit command that triggers when the car price changes makes it a whole lot easier to add a business rule like ‘notify customer if price drops to or below $50.000’, because you know exactly were to put that behavior.

If you have a system where business logic has been randomly added from the UI all the way down to the database, this will be a lot tougher job to get done.

So what about the CRUD?

I believe you can still have your ‘store these 30 fields to the database’-operations in a domain driven CQRS architecture. You can have your SaveData command. But commands like that, CRUD commands, were you don’t care about anything but persisting the data, will not trigger any behavior in you domain model. They will just persist data into your relational database, file system, blob storage, or whatever medium that holds your data.

Then when new requirements arrive and you need to attach behavior to some of the data in that SaveData command, you will just extract those properties out into their own command and you make that new behavior explicit.

Maybe even all the way from UI and down to the domain model. That way you will capture the user’s intent and you will have means to encapsulate that precious domain knowledge inside your model.

Further Reading

For more background and resources on CQRS you can take a look at my previous post “Growth is optional. Choose wisely.”.

I mentioned the Domain Model, Table Module and Record Set patterns and there’s no better way to learn about these – and other patterns – than to read Martin Fowler’s excellent “Patterns of Enterprise Application Architecture”. A short description of the patterns can be found in the P of EAA Catalog’s "Domain Model", "Table Module" and "Record Set".

I also touched Domain Driven Design a bit, and again; No better source than the source itself. If you haven’t already – go read Eric Evans “Domain Driven Design – Tackling Complexity in the Heart of Software”. Just do it. And you can come back and thank me for the tip afterwards :)

Thursday, February 11, 2010

Growth is optional. Choose Wisely.

Command Query Responsibility Segregation, or CQRS for short, is an architectural pattern based on the idea of Command Query Separation, CQS. It’s a pattern currently advocated by people like Udi Dahan, Greg Young, Mark Nijhof and Pål Fossmo (see below for links and resources). sw_fake_ballot_sa03045

The background for CQRS is a mathematical theorem called the CAP Theorem put forward by Eric Brewer. It states that;

“You can have at most two of these properties for any shared-data system: Consistency, Availability, and tolerance of network Partitions.”

You can only get two out of three, which basically means that you have to choose between scalability and continuous consistent data. CQRS is an architectural approach that let’s you scale out and deliver high availability, but is a bit more relaxed on the consistency. Meaning that Consistency has to step aside for Availability and Scalability.

Wouldn’t inconsistent data be a bad thing and something we would really strive to avoid? Yes, it would – if data were to be permanently inconsistent. But as long as the data eventually becomes consistent, this is no longer such a bad thing.

After all; how long are data in a multiuser application 100% consistent anyway? Think about it; As soon as the data has left the database – or whatever storage you might have – and is heading up to the user’s screen, someone else could have updated or even deleted the records. The data can be inconsistent even before they hit the screen!

Making a clear separation of commands (writes) from queries (reads) in an application gives you the ability to better scale out the parts that turns out to be bottlenecks. In most applications there are far more reads than writes, and so scaling out the read part will for most scenarios give a performance boost.

Now, calling it ‘eventual consistency’ might sound like it will take ‘forever’ before data is consistent, but just as you can scale the command and query parts of the system, you can also scale out the transport mechanism between them.

The transport is typically some kind of queue, for instance MSMQ-based, and so the time before data is consistent is coherent to the speed of the transport. Throw in some more power on the queuing machinery, and you get more up-to-date data.

Further reading

Udi Dahan’s “Clarified CQRS” is a good and thorough intro to CQRS.

More introductory on CQRS and how it relates to DDD by Pål Fossmo here; Command and Query Responsibility Segregation (CQRS).

Greg Young gives some clarifications on CQS vs CQRS in "Command Query Separation?".

For some more practical samples check out Mark Nijhof’s blog post "CQRS à la Greg Young", were he introduces his demo app on CQRS and Event Sourcing.

Jonathan Oliver has a run-through of CQRS vs Active Record vs Traditional Domain Model in "DDDD: Why I Love CQRS"

If you’re in the mood for some more background material on Brewer’s CAP Theorem, Julian Browne has an excellent article called "Brewer's CAP Theorem – The cool aid Amazon and Ebay have been drinking".

And just when you’re all pumped up and high on CQRS; Read the “CQRS: Crack for architecture addicts” by Gary Shutler. It might get you down on the ground again. I might not agree with him, but he makes some valid points.

And of course, for all DDD-related topics; The Yahoo Group for Domain Driven Design. Lot’s of good discussion there – including CQRS.

Tuesday, December 29, 2009

So you think you can Host?

Flickr CC by 2.0

Through out my career I’ve often come across small-sized dev-shops that believe…

a) That being their own Application Service Provider (ASP) and hosting their product on their own servers is cheaper, easier and safer than letting a third-party handle it

b) That any dev with a bit of interest for servers and hardware is capable of filling the roles of a full-fledge developer and an IT Pro

Through out my career I’ve never seen this work out particularly well.

The reason why it never works is seldom the lack of talent for the poor dev who ‘stood closest to the server when the last dev-slash-it-guy left’ (freely quoted from Richard Campbell of DotNetRocks). It’s just that being a IT Pro is just as much of a full-day job as being a professional programmer.

In this crazy world of new technologies, languages, frameworks, tools and methodologies that pops up every five minutes, there’s just NO WAY a poor soul can handle two full-time jobs like that and still be GOOD AT BOTH. Some things just got to suffer.

Being a developer by heart – and by job description – it’s pretty obvious which one of those jobs that will suffer. The problem is that you can probably live with this situation for a while before it really hits you. But be sure; it will hit you.

You can have 99,5% uptime for 3 years in a row. But when that server goes up in flames and the backup system won’t restore your last 3 months worth of data, you’ve ruined your uptime numbers for the next 3 decades.

Being a IT Pro means being pro-active. It’s a constant fight to stay ahead of any troubles. And to be prepared and having fail-over when trouble hits you.

Being a dev-slash-it-guy means you won’t have neither time nor devotion to being pro-active. Instead you’re being post-active; you’re putting out small fires every now and then, but you’re seldom doing much to prevent them from catching on.

If you’re a startup company with most customers on beta-programs and not much paying customers yet, that might be ok. But someday you’ll hopefully find yourself with a nice list of paying customers that depends on that nice little piece of software that you hacked together wrote.

They might not expect your software to be flawless (even though they probably should), but they expect it to be there when they need it. They start demanding uptime guarantees and Service Level Agreements, SLAs (or at least they should demand guarantees and SLAs). And you better take steps to make sure that you can provide a level of expected professionalism when it comes to hosting your own services.

Do you think you can deliver that with a (at most) half-time IT pro? My best guess is ‘probably not’. image

From my experience in the field, here’s some questions you should start asking yourself if you find yourself at this stage;

(Now, here comes a full disclosure up front; I’m definitely no IT Pro myself – and I have no intentions what-so-ever to become one. This list might therefore not be 100%-water-and-bulletproof, but if you find some misjudgments or something you’d like to add to the list, please feel free to correct me or give suggestions in the comments below)

  • How many ports are open and how many services are running and available from the outside on your public server(s)? (The server(s) that hosts your software that is). Do you for instance allow remote desktop connections to your public server(s) to be able to troubleshoot it?
  • What happens if someone from the outside takes control over your public server? Do they get access to your local network and domain as well?
  • How many servers are actually accessible from the outside?
  • Do you have a working Virtual Private Network, VPN, that anyone in your business can use? And if so; Is it secure enough?
  • How many times in the last 6 months have you verified that you can actually restore all the data from your backup device? And how sure are you that you’re actually backing up everything you need? Or put it this way; if your office burns down today, will you have all the necessary data available to do business-as-usual tomorrow?
  • How often do you scan your network for suspicious activities? Are you sure you’re alone on your network?
  • Do you have a wireless network available in your office? If so; what minimum level of security does it demand? Do you have just a pre-shared key which then gives you full access to the domain, or do you have something that is actually secure enough to prevent teenage hackers to access your file servers?

I’m not saying that any 3-5 man shops must hire a full time IT Pro to handle this. This is off course a question of cost. But just like you’re probably out-sourcing accounting to some professional book-keeper, you should also out-source other areas that is just as critical for your business.

imageIf you’re a small- or medium-sized dev-shop, hosting is in my experience always handled better by professional ASPs. And the same goes for securing and managing your IT infrastructure.

Don’t get blinded by your luck so far; sooner or later your luck will run out. Then it will no longer be neither cheaper, easier nor safer to handle hosting and infrastructure by yourself – and there’s nothing you can do about it.