ƒµ©Ҝ You WordPress

In my WordPress MU migration, I made the delightful discovery that it will not serve on a 'www' subdomain.  Period.  It will answer, but it will never regard a 'www' subdomain as canonical, and always assume the bare domain is the right spot to serve from.

Fuck You, WordPress.

While I certainly agree that the behaviour of typing 'www.barneyb.com' and 'barneyb.com' into a browser should be the same in nearly all cases (and it is in mine), using the bare domain as the canonical name is absurd.  If I'm hitting a website, its canonical address should be the 'www' subdomain.  Exactly the same as every other protocol significant enough to warrant a globally accepted "standard" subdomain.  Setting up the bare domain to forward to the proper canonical subdomain is good manners, but the bare domain is not the web host.

WordPress MU, on the other hand, requires exactly the opposite.

My web site is located at www.barneyb.com.  Period.  End of story.  If you're lazy and type in 'barneyb.com', it'll happily forward you to the right location, because I'm a courteous administrator, but you'll not stay on 'barneyb.com'.  I'm lazy in most of my one-off demo apps and don't enforce canonicalization, so in those cases you can see either one in the address bar, but for my "official" web presence it's www.barneyb.com and only www.barneyb.com.

Perhaps worth mentioning is that my OpenID lives at http://barneyb.com/; it does not live at http://www.barneyb.com/.  Why?  Because while OpenID is HTTP-based, it's not part of my web site, and so doesn't belong on the 'www' subdomain.

I also split my HTTPS traffic to a separate subdomain (ssl.barneyb.com), though if you go there you'll be redirected back to www.barneyb.com unless you enter a specific application's path.  All those applications are not part of my "official" website; they're things like EventLog, Pic of the Day, timecard/invoicing, financial tracking, etc.

I'm going to intentionally not dwell on this any more this evening.  My gut reaction is to say "you're stupid, I'm not going to use you" and stick to my Ant-based solution.  I'm not sure a bit of time to "cool down" will change that decision, but I'm going to give WPMU the benefit of the doubt and delay the decision until I'm not quite as pissed off.  It doesn't deserve it (software is supposed to serve the user, not the other way around), but I'm a courteous administrator after all.

Update (2009-05-04): I've come back and patched WordPress MU to allow the use of a 'www' subdomain.  Hopefully the change will eventually make it back to the main codebase, but I'm not holding my breath.

Routing Issues

Complex Drive is having some issues with routing traffic to my current server from certain locations. So if you can't get to my site, that's why. Seems that a relatively small portion of the internet actually passes through the offending router, so that's good. Unfortunately, Mentor is one of those locations, so I'm typing this in elinks, that champion of text-mode browsers. Hopefully they'll get it sorted soon.

Server Stories

Got a whole bunch of work done on my new box this evening.  Still lots to do, but well on my way.  Pluggable on-box backups for MySQL, pluggable S3 backups for arbitrary files, a WordPress MU skeleton, a few Tomcats, base HTTPD config, even a ColdFusion instance.  Unfortunately, the majority of my stuff runs on either www.barneyb.com or ssl.barneyb.com, which means moving those domains is an all-or-nothing arrangement.  It'll be a while before those move, but I did move one site (a nearly codeless Magnolia-managed site) over to the new box, so it's not a complete deadweight.

The WordPress stuff is going to be the biggest mess, I think.  I've been running a home-grown multiblogging app based on WordPress for the past several years and I'm doing away with it.  It's built with Ant and while you treat it as a single copy of the code, in reality you end up with a separate WordPress installation for each blog (with appropriate customizations, themes, etc.).  It gets the job done, but it's rather cumbersome in various ways (no shared users, can't serve from a working copy, complex directory structure, etc.), so I'm going to bite the bullet and "upgrade" to MU.  Should make my life enormously easier in the long run, but in the short term I'm going to have to do some pretty invasive surgery to some plugins I cut corners on when I wrote them originally.  Also going to have to figure out how to migrate all the data into the slightly different MU schema.

Second to that will be Pic of the Day, simply because it has so many moving pieces, but I've done a lot of organizational work on it over the past year so it'll be a lot easier now than it would have been last summer when I was considering migrating to a new box.  I was hoping to switch to Railo to get it in an isolated JVM, but there are some issues preventing that, both with Railo and my code.  I don't have the RAM to run multiple copies of ColdFusion, nor the money to buy more, so it'll stay on a shared instance for now.  Fortunately, despite the fact that it's constantly churning its gears, most of it can be taken offline for a day or three with no user-visible effect.  The user-visible portion of the app is a very narrow slice of the whole picture.

Finally, I'm trying really hard to ditch JRun for Tomcat everywhere.  The JRun connector is truly amazing in how it lets you virtualize your webroot, so it's not easy.  However, now that I've got Apache 2.2, I can bring mod_proxy_ajp and mod_rewrite to bear and hopefully get equivalent functionality.  I'm willing to sacrifice a little bit of filesystem cleanliness for having a single web container for all my sites, but not much.  Fortunately, that's something I should be able to continuously refine and optimize over time, even after stuff is live.

I'm hoping to be all moved over within the month.  Long time, I know, but there's a lot of stuff to do, and this is a "free time" project.  With the inability to effectively test and migrate individual pieces, I have to be really careful and that always takes longer.  Wish me luck!

Watch Your Column Types

I've been bit by this twice in the past few months: comparing database columns that aren't the same type is really really expensive.  If you've only got a few rows, no big deal, but if you've got a few hundred thousand (or a few tens of millions) it makes a huge difference.  And varchar is NOT the same as nvarchar on SQL Server.

OBD is On The Ball

Both of the bugs I found in Open BlueDragon while working on CFGroovy have been fixed.  Yay for being on the ball.  The bugs were that CFDIRECTORY didn't recurse if you used a filter and that numberFormat didn't accept java.math.BigDecimal as a valid number.

CFGroovy Updates

I made a bunch of updates to CFGroovy tonight, mostly centered around two main objectives:

  1. refocus on the original objective, inline Groovy scriptlets
  2. support Open BlueDragon

The first objective is based on my experience developing the last couple apps I've used CFGroovy on.  Hibernate is kickass, but neither of the apps used it; they only used CFGroovy for scriptlets.  What I discovered was that while using CFGroovy is pretty simple, it's not really all that friendly towards a pure-scriptlet use case.  Adding Hibernate provided a huge boost in capabilities, but it also roughed up the edges of the core Groovy engine quite a bit more than I had realized, and you pay the price regardless of whether you're leveraging Hibernate or not.  As powerful as Hibernate is, I still think scriptlets are the real win.  If you get scriptlets, Hibernate becomes just another Java API you can easily leverage from CFML.

So I've started going back through with a focus on smoothing those things back out.  The specific objective is to be able to download CFGroovy, drop the JAR into my /WEB-INF/lib folder, CFIMPORT the taglib, and run <g:script> in a production-ready fashion.  Obviously it can't hope to support a custom classpath, precompilation, Hibernate, etc., but that's not the point if you just want scriptlets.  There's no reason CFGroovy can't support both use models equally well.  Immediately following that objective is that taking an app and "upgrading" to a more customized CFGroovy configuration should be transparent to the code.  Just new configuration in Application.cfc or ColdSpring and the app automatically adjusts.

And yes, CFGroovy no longer bootstraps itself into its own little world, including loading its own Groovy JAR.  Compared to spinning up Hibernate the cost of doing that is a drop in the bucket, but it's not acceptable for my above-stated goal for various reasons.  So I've reverted to requiring installation of the JAR.  Certainly an added hardship, but to get production-ready performance and memory consumption out of the box, it has to be that way.  It also makes the internals enormously simpler.

The other major change is that CFGroovy is no longer implemented in Groovy, it's 100% CFML again.  This is for much the same reason as the JAR loading: the price is too high in terms of performance and memory consumption for the pure-scriptlet class of development.  I'd rather implement in Groovy than CFML because everything is Java APIs, but bootstrapping a Groovy environment to bootstrap the CFGroovy environment effectively doubles the initialization work.  Only measured in tenths of seconds at the most, and therefore invisible if you're spinning up Hibernate, but a non-trivial penalty if you aren't.

Second, Open BlueDragon support is mostly there.  The pre-1.0 releases had some classloading issues that prevented stuff from working, so I just wrote it off.  However, I decided to give it another try tonight on a whim, and those issues seem to be resolved.  The core CFML->Groovy->CFML flow works perfectly, but there are a number of warts.  OBD doesn't support recursive CFDIRECTORY so a number of things simply fail, including packaged classes.  It also doesn't handle all the Java numerics so if you do floating-point math in Groovy you may or may not get back a number you can use in CFML.

That said, if you check out trunk from Subversion onto an OBD instance, you'll be able to run all the Groovy demos except "Simple Objects" (because of the numerics issue – java.math.BigDecimal specifically).  The Hibernate integration currently depends on packaged classes which OBD can't compile, but if I can get access to the datasource information as I can on CF and Railo, I think Hibernate will work as well.

Finally, I want to give a big shout out to my version control system.  The ability to go back and selectively revert individual changes in individual files from the past four months saved me hours and hours of headaches.  If you don't have version control set up for every line of code you write, stop reading right now and go do that.  Then you can come back and finish.

As always, source is available at https://ssl.barneyb.com/svn/barneyb/cfgroovy/trunk/demo.  That's the demo app, as you'd imagine from the URL, which includes the actual CFGroovy engine.

CFGroovy for Open BlueDragon!

Doing some hacking around this evening and got CFGroovy to successfully run simple scripts on Open BlueDragon 1.0.1.  It still doesn't support array/struct literals (WTF?) so the engine won't run in its entirety, but the core integration flow works.  That's promising.  There are still a pile of issues to work out (like some major path-related problems), but it's coming.

And no, no idea about Hibernate support.  That one is WAY more complicated.  : )

The Latest ColdFusion Mindƒµ©Ҝ

It's pretty common knowledge that ColdFusion passes arrays to UDFs by value, and not by reference like pretty much every other language.  It's a weird behaviour, but as long as you remember to write array-processing functions to be used like this:

<cfset myArray = modifyArray(myArray) />

instead of like this:

<cfset modifyArray(myArray) />

you'll be fine.  However, someone pointed out on the Railo mailing list that this behaviour is not constrained to UDFs.  Assignment behaves the same way!  Yes, if you assign an array to another variable, you get a copy.  The only "assignment" that is exempt from this behaviour is if you use structInsert.

Consider this code:

<cfset a = [37] />
<cfset a2 = a />
<cfset s1 = {array = a} />
<cfset s2 = {} />
<cfset s3 = {} />
<cfset s4 = {} />
<cfset s2.array = a />
<cfset s3["array"] = a />
<cfset structInsert(s4, "array", a) />
<cfset arrayAppend(a, 42) />
<cfset System = createObject("java", "java.lang.System") />
<cfoutput>
<p>#System.identityHashcode(a)# - #a.toString()#               <!--- [37, 42] --->
<p>#System.identityHashcode(a2)# - #a2.toString()#             <!--- [37]     --->
<p>#System.identityHashcode(s1.array)# - #s1.array.toString()# <!--- [37]     --->
<p>#System.identityHashcode(s2.array)# - #s2.array.toString()# <!--- [37]     --->
<p>#System.identityHashcode(s3.array)# - #s3.array.toString()# <!--- [37]     --->
<p>#System.identityHashcode(s4.array)# - #s4.array.toString()# <!--- [37, 42] --->
</cfoutput>

At the end of this code, there are five distinct arrays, one containing 37 and 42, and the other four containing only 37.  The assignments to a2, s1.array, s2.array, and s3["array"] create the four copies, and the structInsert line creates a new "pointer" to the existing array.  Then when the arrayAppend line executes, the original array has 42 added to it, which only affects 'a' and 's4.array', because the rest are copies.

On Railo, arrays are universally reference types, so there would only be a single array in the above example.  This is consistent with most other languages.
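
For comparison, here's a rough JavaScript sketch of the reference behaviour described above for Railo and most other languages.  The variable names mirror the CFML example; the 'copy' variable and its explicit slice() call are my own addition, there only to show what ColdFusion's assignment does implicitly.

```javascript
// Arrays are reference types in JavaScript: assignment copies the
// reference, not the array itself.
const a = [37];
const a2 = a;             // second name for the same array
const s1 = { array: a };  // same array again, stored in an object
const copy = a.slice();   // explicit shallow copy (illustrative addition)

a.push(42);

console.log(a2);        // [ 37, 42 ]
console.log(s1.array);  // [ 37, 42 ]
console.log(copy);      // [ 37 ]
console.log(a2 === a);  // true
```

Only the explicit copy misses the appended 42; everything else is the same array under another name.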

Functional Programming Languages

Ever done functional programming?  Chances are you'll say "no", but you'll probably be wrong.  Javascript is a functional language, and while a lot of people use it in a procedural and/or object oriented way (\me raises hand), its foundation is functional.  Same deal with ActionScript.  Used Groovy?  Ruby?  Python?  None are functional (let alone pure), but all have significant functional aspects.

So what is a purely functional language?  In a nutshell, it's a language that doesn't have the concept of mutability.  Things never change in a functional environment, all that happens is that new things are derived from old things.  Here's an example, which should be familiar to anyone who has done Groovy/Ruby/Python or used jQuery/Prototype:

nameArray = userObjectArray.pluckField("firstName").sort()

No mutation.  Only derivation.  It's like magic.  What's happening is the pluckField method is creating a new array with first names, and then the sort method is creating a new array in sorted order and that third array is what is set to the 'nameArray' variable.
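
Here's a minimal sketch of that same derivation chain in plain JavaScript.  The pluckField and sorted helpers and the sample data are hypothetical (neither helper is a built-in), and since JavaScript's own sort() works in place, the sketch copies the array first to keep it a pure derivation.

```javascript
// Hypothetical sample data.
const users = [
  { firstName: "Wilma" },
  { firstName: "Barney" },
  { firstName: "Fred" },
];

// Derive a new array holding one field from each object.
const pluckField = (items, field) => items.map((item) => item[field]);

// Array.prototype.sort mutates, so sort a copy instead of the input.
const sorted = (items) => items.slice().sort();

const firstNames = pluckField(users, "firstName");
const nameArray = sorted(firstNames);

console.log(firstNames); // [ 'Wilma', 'Barney', 'Fred' ]
console.log(nameArray);  // [ 'Barney', 'Fred', 'Wilma' ]
```

All three arrays still exist at the end, and none of them was changed after it was created.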

This is ridiculously powerful, because it eliminates scoping issues from the mix.  There is no place a race condition can crop up anywhere, because everything is immutable.  What does that mean to you the developer?  It means the end of worrying about thread safety.  No more locks or scoping issues (e.g. 'var' in CFML).  It also results in really readable code.  Start at the left, read every word, and if you have decent function names it's pretty obvious what happens.

Of course, sometimes change is good, so most functional languages have the concept of mutability.  The pure ones are typically reserved for the math nerds, since algorithms in a pure functional language can actually be "proven" just like a mathematical proof.

I don't really know what my point is, except that CFML continually pisses me off with its lack of any type of functional nature.  It sort of has higher order functions, but since it lacks closures, they're of minimal utility.  Even currying would give them some basic helpfulness, though certainly a kludge.
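
To make the complaint concrete, here's a small JavaScript sketch of what closures and hand-rolled currying buy you when passing functions around.  The function names are made up for illustration.

```javascript
// A closure: the returned function remembers `factor` from the
// enclosing call, long after multiplier() has returned.
function multiplier(factor) {
  return function (n) {
    return n * factor;
  };
}

// Currying by hand: a two-argument add taken one argument at a time.
const add = (x) => (y) => x + y;

const double = multiplier(2);
const addTen = add(10);

console.log([1, 2, 3].map(double)); // [ 2, 4, 6 ]
console.log([1, 2, 3].map(addTen)); // [ 11, 12, 13 ]
```

Without closures, 'factor' and 'x' would have to live in some shared scope, and the higher order function would be configured by side effect.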

I was writing a simple Google Charts wrapper (to replace an SVG/Batik engine) on an app that doesn't have Groovy available to it and it's such a friggin' pain.  It's almost enough to make me want to go build a CFGroovy Lite that I can cheat into place more easily than the full framework.  Not that Groovy is even close to purely functional, but it's easy to use as if it is.

If only Clojure wasn't Lisp; I don't have that many parentheses.  A JVM-based functional language with a C/Java-style syntax would be truly excellent.

jQuery.bind() Data

If you only ever use the type-specific event helpers (.click(), .load(), .change(), etc.), you're potentially missing out on a really handy feature of jQuery: bind data.  Bind data is data associated with the bound handler function, available to every invocation of it, but not in any "normal" variable scope.  It's kind of like currying, except with an event attribute instead of arguments.  You declare bind data like this (as the optional middle parameter of the .bind() method):

jQuery("#selector").bind("click", {name: "barney"}, clickHandler);

And then you use it like this:

function clickHandler(e) {
  alert(e.data.name);
}

The 'e' variable is the standard jQuery event object passed to all event handler functions.  The 'data' key within it contains the bind data created when the handler function was bound, if any.  Even this simple case is potentially interesting because you're controlling the alert message at bind-time, not in the function itself, without the extra wrapper function you'd otherwise use:

jQuery("#selector").click(function() { clickHandler("barney")});
function clickHandler(name) {
  alert(name);
}

But that's not the really useful bit.  What you might not have noticed about the two examples is that in the first, the string "barney" is evaluated once (at bind time), but in the second it's evaluated every time the click event is dispatched.  No big deal for a string, but for complex expressions that can create a performance issue.  Since event handlers are typically executed more than once (and often a lot more than once), moving stuff out of the callback and into bind data can actually yield a noticeable difference in UI performance, as I learned this evening.
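
Here's a rough sketch of that once-versus-every-time difference that runs without a browser.  The makeTarget object is a tiny stand-in I made up for the example, not jQuery (its bind() skips the event type argument), and expensiveLookup is a hypothetical costly expression that just counts how often it runs.

```javascript
// Minimal stand-in for an element with bind/trigger; NOT jQuery itself.
function makeTarget() {
  const handlers = [];
  return {
    bind(data, handler) { handlers.push({ data, handler }); },
    trigger() { handlers.forEach((h) => h.handler({ data: h.data })); },
  };
}

let evaluations = 0;
// Hypothetical "expensive" expression; it only counts its own calls.
function expensiveLookup() {
  evaluations += 1;
  return "barney";
}

// Wrapper style: the expression is re-evaluated on every dispatch.
const wrapped = makeTarget();
wrapped.bind(null, () => expensiveLookup());
wrapped.trigger();
wrapped.trigger();
wrapped.trigger();
const wrapperEvaluations = evaluations;

// Bind-data style: the expression is evaluated once, when bind() is called.
evaluations = 0;
const bound = makeTarget();
bound.bind({ name: expensiveLookup() }, (e) => e.data.name);
bound.trigger();
bound.trigger();
bound.trigger();
const bindDataEvaluations = evaluations;

console.log(wrapperEvaluations, bindDataEvaluations); // 3 1
```

Three dispatches cost three evaluations with the wrapper and one with bind data, and the gap grows with every further event.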

In a photo viewing app I built I'm using JS to display multiple photos sequentially via a single IMG tag, just swapping out the source, width/height, title, etc.  I wanted to center the photo tag, but couldn't do it with CSS because some dynamically-sized absolute-positioned elements needed to be accounted for, as well as the size of the photo itself.  I'd originally written a handler that looked something like this:

jQuery("#theImage").load(function() {
  var base = getOffsets("content");
  var total = jQuery("#floatyControlBar").position().left;
  var object = jQuery("#imageContainer");
  var xd = Math.max(0, (total - object.width()) / 2);
  object.css({
    left: base.x + xd + "px",
    top: base.y + "px"
  });
});

It worked great, except that it made a noticeable jump from the old position to the new position when the image loaded.  I first tried the Flash-style "add an animation to mask the slowness", but the browser couldn't handle sliding a big image around very gracefully, so it wasn't any better.  Enter bind data.  Changing the handler to this resolved the issue:

jQuery("#theImage").bind("load", {
  base: getOffsets("content"),
  total: jQuery("#floatyControlBar").position().left,
  object: jQuery("#imageContainer")
}, function(e) {
  var xd = Math.max(0, (e.data.total - e.data.object.width()) / 2);
  e.data.object.css({
    left: e.data.base.x + xd + "px",
    top: e.data.base.y + "px"
  });
});

Basically I just changed it so most of the data collection happens at bind time (so it only happens once) and as little as possible happens when the handler is actually called.  It's a simple optimization, but it eliminated the visible jump.  The code has paid a slight price in readability, no question there, but while we all tout readable code as a hallmark of good code, user experience is still what software is about, so the latter wins.

Of course, you could achieve the same effect by setting the base, total, and object variables into the scope containing the .bind() call so the handler can access them via closure-nature.  But that pollutes the containing scope (breaks encapsulation), so using bind data is a better solution in most cases.