Continuing the transactions thread, I wanted to talk about a couple specific reasons that transactions are vitally important in ORM applications. Even more than in raw-SQL applications. I consider them mandatory anywhere, but I would argue that if you feel compelled to skip them for some reason, you'd be better off leaving them out of your SQL applications than your ORM applications.
But before I dive into the specifics, I wanted to point out a rather fundamental issue I see with the CFML community: transactions are an afterthought. I've been using CFML for over a decade and the laissez faire approach to transactions never ceases to amaze me. Not individual developers, mind you, I'm talking about CFML developers as a group. If you look at the Spring documentation, for example, the first topic under "Data Access" is transactions. Not DAO support, not JDBC, not ORM. Transactions. And that is absolutely the correct order. We as a community need to get this one sorted. It'll result in better CFML applications, richer CFML toolkits (like native transaction support in ColdSpring and others), improve the public opinion of CFML developers, and just generally be better all around.
Now back to transactions and ORM…
Reason #1
Let's assume for the moment that you're processing the submission of an account creation form (name, username, password) using CF9 ORM. Here's some code you might use (minus any architecture and validation):
a = new com.site.Account(); a.setName(attributes.name); a.setUsername(attributes.username); a.setPassword(attributes.password); entitySave(a);
This is pretty straightforward: create an Account instance, set the properties, and all entitySave(). So where does the actual INSERT statement happen?
No one knows!
It's possible that it'll happen at the entitySave() call (depending on how you're doing unique identifiers). It's possible it'll happen at the end of the request (depending on how "this.ormsettings.flushAtRequestEnd" is set in Application.cfc). It's possible it will never happen at all and the submitted data will be lost.
Egads! Really?
Yes, really.
If we simply wrap that code with a transaction block, all our problems go away:
transaction { a = new com.site.Account(); a.setName(attributes.name); a.setUsername(attributes.username); a.setPassword(attributes.password); entitySave(a); }
Now we can state with complete certainty that by the time the transaction block closes not only will the INSERT have executed, but the new record is assured to be safely persisted within the database. Thank goodness!
It's not all roses though, because using transaction blocks like this can screw with your sessions (ORM sessions, not the session scope). Adobe has fixed the issue, but the initial CF9 bits are broken. So you really want to use Hibernate transactions instead of CFML ones to avoid the issue. If you use ColdSpring, I have a TransactionAdvice specifically for addressing this problem. If not, you just need to write the above code like this:
txn = ormGetSession().beginTransaction(); try { a = new com.site.Account(); a.setName(attributes.name); a.setUsername(attributes.username); a.setPassword(attributes.password); entitySave(a); txn.commit(); } catch (e) { txn.rollback(); throw e; }
It's a couple extra lines of code, but it'll save your sanity – particularly on transactions that return a result which you want to manipulate later. And really, you should just use ColdSpring to manage your service layer and use AOP to do it all magically. :)
Reason #2
CF9 shipped with "this.ormsettings.flushAtRequestEnd = true" as the default, which means that side effects can trickle into your database without you being aware of them. The first thing you should do in any CF9 ORM application is set flushAtRequestEnd to false. In fact, stop reading and go do that right now for any existing applications you have as well. I'll wait.
The primary reason you want this turned off is that you don't want changes being made to your database inadvertently. If you're flushing at request end, innocently calling some setter somewhere (maybe as part of validation) will change your database. Even worse, the change won't make it into the database until after the current request is complete, so you won't be aware of it until the next request at the earliest. More likely, however, you won't notice in development and then your customer will call a week or two after production deployment and ask why all of the records have invalid values in some column or another. Oops.
If you ignored my advice a couple paragraphs back, I beg of you, please go set flushAtRequestEnd to false. Right now.
By disabling flushAtRequestEnd, changes to persistent entities will no longer make it to the database without your intervention. Believe me when I say that this is the way it should be. You, as the application developer, need to be in charge of when and what stuff is made persistent. ORM is about removing the busy work of persistence, it's not about making you unaware of persistence. There are two ways you can do it:
- ormFlush() – this will do a non-transactional flush of every modification in the current ORM session. This is what flushAtRequestEnd invokes at the end of the request. At least before you went and disabled it everywhere, right? Right?
- transactions – allows wrapping of specific changes in a transactional context so the stuff you want get saved when you want. Plus all the "normal" benefits.
Even if you're doing transactions manually, the work required on your part is pretty much identical for the two options, so can get all the benefits of transactionality for "free" by choosing that one. And if you go with transactions, you can probably make things even easier by using AOP to apply them by default, basically freeing you from doing any work at all.
The bottom line is that transactions need to be a carefully considered part of any and every application. And if you're using ORM (which is state based) instead of SQL (which is statement based), then it's even more important.
This is a very very good post. Thanks.
I really appreciate this sort of discussion where we get away from the how-to on a low-level and look at the bigger picture. I take it the email thread on this got you thinking about it, eh?
"By disabling flushAtRequestEnd, changes to persistent entities will no longer make it to the database without your intervention. Believe me when I say that this is the way it should be. You, as the application developer, need to be in charge of when and what stuff is made persistent. ORM is about removing the busy work of persistence, it's not about making you unaware of persistence."
Would another way of putting this be that when the current state of data needs to be saved should be driven by the business rules?
Very much agree with the sentiment – see http://lagod.id.au/blog/?p=43
I do have to mildy disagree with these lines, though:
"If you're flushing at request end, innocently calling some setter somewhere (maybe as part of validation) will change your database. Even worse, the change won't make it into the database until after the current request is complete, so you won't be aware of it until the next request at the earliest."
There's a mental model here that in-memory state is separate from persistent state, and both have to be managed explicitly. I'd suggest one of the important functions of an ORM is to remove this distinction for persistent variables and relieve the application of that responsibility. When I call a setter, I expect the value to stay set – not revert back after some unspecified period of time (at the end of the request, session or application, depending on where I stored the variable). If I want to set a value temporarily, I use a temporary variable. It's not the only way to use ORM, but I think it's the most conceptually consistent.
Jamie,
I'm with you: your domain model should be pristine, regardless of where it lives (db, memory, cache, etc.). But the "pattern" of using your entities to do validation is a common one none the less. In any case, that's not the only place you can invoke a setter. For example, your HTML guy might do what amounts to a CFPARAM inside a view:
<cfif person.getAddress() EQ "">
<cfset person.setAddress("Unknown") />
</cfif>
so that when he's outputting his Person he can just emit the address without having to check. He doesn't know about magic state-based persistence, and this approach had always been perfectly reasonable in the age of recordsets and arrays of structs.
My point wasn't that validation in your domain model is a good thing, more that it's basically impossible to ensure you don't have "leakage" into your domain model, and validation is a common scenario (for better or for worse).
Allen,
Sorry, I missed your comment. Your restating is partially accurate. It's not just the when, but also the what. CF's initial implementation of session-per-transaction helped with the what, because you couldn't move changes into a transaction for committing, but came with enough downsides to make it unusable. You have to be really careful with what you're saving as well as when it happens.
Barney,
Gotcha. So the problem is, how to minimize damage from developers calling setters inappropriately, right? I think I do something similar to what you propose, although because I use native Hibernate the details are different – rather than turning off flush I wrap the entire view layer in a read-only transaction.
Jaime
Jamie,
If you're using raw Hibernate, the read-only transaction is a fantastic tool for ensuring you don't shoot yourself in the foot (and that no one else does either). Unfortunately it's not available to us in CF, at least that I'm aware of.
It's not really about minimizing damage, though. At least in my mind, it's more about failing fast when you create a bug (in this case a non-transactional write). I don't want to protect my database when it happens in production, I want to ensure that the developer gets yelled at when it happens in development so the broken code won't ever make it to SVN. Which isn't to say the in-production benefits aren't nice (they're hella nice), but a hard fail at develop time is a huge win.
Sorry if this is a dense question Barney, but can I do something like this (with flushatrequestend = false):
If this is valid then it makes life a lot easier!
Thanks!
- John
John,
It depends. In this simple example it'll be safe, because creating the new object and calling the setters doesn't require any DB interaction (which would happen outside a transaction). The point is that you have your transaction demarcation in the right spot, just as if you were writing SQL manually. You never do this (read outside a transaction in order to write inside a transaction) with SQL:
The same principle applies with ORM, just it's far more difficult to determine when the data access is happening when you're not in control of the raw SQL. Typically this sort of thing would crop up with defining relationships, but could also be caused by lazy loading across a multi-table entity, certain primary key generators, etc.
Thanks Barney – I hadn't even considered lazy loading. That's a really good point!
Great post Barney – very good to know. Can you clarify what you mean by "Adobe has fixed the issue, but the initial CF9 bits are broken. " ?
Does that mean in CF9 with latest HotFix/update (or next HF) we can use the CF transaction function and not ormGetSession().beginTransaction() or is it fixed in a yet to be released HF/update?
Thanks
Johan,
The initial CF9 release is broken regarding the way sessions are handled across transaction boundaries. The issue has been resolved by the CF team at Adobe. I would imagine a change of this magnitude would NOT come out as part of a hotfix, but only as a point release, so I'd be surprised if we see it before the official 9.0.1 drop. I have no information about when that might be.
Thanks for clarifying.
@Barney – just out of interest – I use ormGetSession().beginTransaction() as you recommend and it works fine on MySQL and Derby (embedded CF version used for quick tests).
However if I use thread.txn = ormGetSession().beginTransaction() where a separate thread is used to processes data in the background it works fine in MySQL by Derby cannot handle any requests until the thread processing is complete. It gives error like this: java.sql.SQLTransactionRollbackException: A lock could not be obtained within the time requested.
Its not a problem for my apps – just curious why Derby chokes.
Johan,
I really have no idea, but my suspicion is that because Derby is an in-JVM DB, it somehow ties together the JVM thread and the DB connection. When you start doing multithreaded stuff that one-to-one association is no longer valid, and things go south. That's pure supposition, however, and quite likely is actually an issue with the CFML engine not using ThreadGroups correctly rather than with Derby.
Thanks