Critical Bug: Schema Tool

I discovered a critical bug in my schema tool this evening. During the refactoring I did last week to allow multiple persistence mechanisms (so it doesn't require a DB) I reintroduced a bug that I had fixed in the DB-only version.

If the first minor version of a migration (major version) throws an error, it updates the major version counter in the persistent store without resetting the minor version to zero. As such, when you resolve the error and rerun the script, the tool believes that you've already run N minor versions of the migration, where N is the number of minor versions in the previous migration script.

If you've downloaded the tool since the 18th of November, you've got the buggy code, and need to download a freshly patched copy. Subversion is also updated, if that's your preferred source.

As a freebie, there's a new enhancement in the SQL-based migration script for supplying a custom statement delimiter.  Previously you had to use a semicolon, but now you can specify your own.  Check the comment at the top sqlbasedmigration.cfc for more details.

Just Say No to Materialized MySQL Views

I'm a big fan of MySQL, but I ran into an interesting performance issue this weekend regarding views. MySQL added view support in 5.0, and I've used it to great effect, but it's not all roses.

Edit (2009-04-22): I used the term "materialized view" below in a misleading way.  True materialized views are stored in permanent concrete tables.  Below I refer to views that are created into a temp table on demand, used for the query in question, and then deleted.

Within MySQL, there are several ways to categorize views, but the one I care about is materialized vs. rewritten/merged. Materialized views are views that, as you might guess, are materialized when they're needed. In simpler terms, the view definition is used to create a temporary table and then that temporary table is used for the actual query. Rewritten or merged views are combined with the actual query by the SQL engine and executed as a single query.

The performance problem I ran across was with the former view type. It seems that the internals of MySQL manage materializing in a very inefficient way. I can't provide any sort of hard data to support that, but what I experienced was definitely points to the conclusion.

I had a view that had some fairly intensive stuff in it (both structurally and computationally). I knew it was materialized and was fine with a little bit of a slowdown, but I was seeing times in the 15 second range for certain queries that used it on my personal server (which, admittedly, is a only 3GHz Celeron). Manually materializing the view (i.e. running the view statement to create my own table) and then running the queries in question took a few hundred milliseconds tops, counting both the generation of the temp table and the actual query time (with most of it being the former).

The only explanation I can come up with was that the materializing of the view was happening repeatedly. Very repeatedly. Recasting the "view" as a normal table that I repopulate as needed has eliminated all the performance issues, and the queries are now typically running in tens of milliseconds.

I should mention that I've never had any performance issues with rewritten/merged views. MySQL seems to do a fantastic job efficiency-wise with them, and I continue to use them.

Interestingly, I had a very similar problem with Oracle a few months ago. The main difference was that I wasn't using views, but rather subqueries in the FROM clause (i.e., inline views). In that app, creating separate views was both impossible (we didn't own the database), and unnecessary (only one query used the view), and we were able to get around it with "factored subqueries" (the WITH clause).

More Flex Chart Goodness

This evening, I extended my Flex charting widget further.  The demo is still available, and I've posted a new ZIP archive of the source.   Here's a quick rundown of what's changed:

  • The custom tag is now called xmlchart.cfm, instead of just chart.cfm.  That's to make way for a different chart.cfm that provides an XML-less interface (like CFCHART does) that will internally build the XML and then hand it off to xmlchart.cfm.
  • Category and DateTime axes are now supported.  The only gotcha is that you must supply a milliseconds-since-epoch value for your timestamps.  It's XML, not real objects, and the serialization/deserialization doesn't work right with native dates.
  • Pie charts are now available.
  • Legend placement is now customizable (top, bottom, left, right, none).
  • Axes can have a title set for them, and they can be positioned on either edge of the chart (top/bottom for x, left/right for y).
  • The descriptor language has changed significantly:
    • The x- and y-axis now have their own element under the chart element.
    • All series and sets are now contained with a new data element under the chart element.
    • There are only series and set elements now, and both take a type attribute for determining what type of series/set they are.  Series implicitly inherit their type from the containing set, if present, otherwise their type must be specified.

At this point, I'm pretty much done developing, and I'm on to using it.  I did one app as a proof of concept and it works great.  I'll be doing some more shortly.

And Happy Thanksgiving, at least for all you US folks.

My Flex-based Chart Engine

I've been looking for a good charting mechanism for a few personal apps.   CFCHART works in some cases, but I usually opt for SVG, either rendered inline (i.e. SVG and XHTML interleaved in an XML doc) or rasterized into a PNG server-side via Batik.  Both solutions have their merits, but both also have a lot of problems (drawing SVG requires a lot of arcane math, and CFCHART is quite inflexible and has a number of bugs).

At work, I've been doing quite a bit of work with the Flex Charting components, and I got to thinking that a generic chart that could be parameterized with both data and structure wouldn't be too hard to build, but never really did anything about it.  Then today Koen, Joshua, and I were talking about something or other and somehow ended up with me mentioning you could inline your MXML with a JSP taglib and have a Flex movie spit out in your page (much as CFCHART does it).  I threw together a little proof of concept, and digging in a little deeper with flashvar parameterization, realized that you could do some pretty sweet stuff with it.

I kept playing a bit, and started building some generic charting stuff, and this is what I ended up with.  It takes a descriptor XML document either as a literal or a URL, supplied via flashvars or via ExternalInterface callbacks, and will draw a chart based on it.  The descriptor is just that, a complete description of the chart to render, both data and structure.  There's a link to the descriptor on the demo page, so you can see it.  The flexibility is pretty limited (for example, you can't disable the legend), but all that stuff would be pretty simple to add down the road.

The engine only supports Cartesian charts, and only numeric data (linear or logarithmic), but that covers the vast majority of charts, at least for me.  It'd be fairly trivial to extend it to support category and/or datetime axes, I just didn't do it.  The non-Cartesian charts (e.g. pie charts) would be a little more work, as they're pretty fundamentally different, but it should be a backwards compatible change (i.e. strictly additive changes to the descriptor language).

I've made a ZIP of the demo app, including the custom tag wrapper and the MXML source.  For the runtime compilation via the JSP taglib, you have to have Flex on your server.  If you've got CF8 with LCDS, you're set.  I'm not sure about other configurations, but I know the JSP taglib has been around for a while, so I'd expect it to work other places as well (though you might need to tweak).

The runtime compilation is not at all required for the engine to work, it's just an easy way to package it because the taglib takes care of generating all the wrapper HTML/JS.  If you don't mind writing that yourself, you can use the SWF directly.

Also, the SWF is compiled with Flex SDK 2.0.1, because that's what CF's LCDS uses.  Among other things, that means you don't get "grouped" ColumnSets, because those are new in Flex 3.  The MXML will recompile as-is with SDK 3 (e.g. in FlexBuilder), and then you'll be able to use "grouped", but the runtime compilation won't do that unless you upgrade the internal SDK.

Schema Tool Update

I've updated my schema tool again, this time with some pretty significant changes.  Quick recap: the goal of the app is to manage your DB schema via managed code, rather than some external process, so you get transparent database upgrade to all environments as part of deploying a new version of your app.  Very handy in production, essential in development.

Here's a rundown of what's new:

  • Multiple tool types are now supported via a new 'toolfactory' CFC.  The original "database" tool has been joined by a new filesystem tool that stores it's version info in a file, rather than a database.  If you want to use the migration infrastructure for something outside a database (like your CF Admin config via the admin API) you can now do it with no database dependency using the filesystem tool.
  • The 'tool' CFC has been deprecated, in favor of 'databasetool'.  Note that with the new multi-tool arrangement, the order of init parameters has changed (since they're now passed to 'toolfactory' not the tool directly), so watch out when you're upgrading.
  • SQL-based migrations (where you use a SQL file instead of a CFC) are now a lot safer.  The original minor version computations made some assumptions about the scripts that were "mostly" safe, but not completely.  I've improved the parsing significantly (reusing some code from the Rhino project) to do string stripping/replacement, and support both — and /* */ comments.
  • This is entirely internal, but the organization of components has been much improved.  There was a lot of non-encapsulation in there originally, but I've fixed most of that, as well as done some more abstraction to get some additional logic reuse, especially across different tools.

Assuming you haven't gone hacking the core, nor subclassed 'abstractmigration' and overridden 'init', you should be able to migrate simply by replacing the line in Application.cfm that runs the tool.  Instead of creating a tool directly and invoking it, you now use the 'toolfactory' to create your tool implicitly, and then invoke it.  best online slots canada The demo app has an example.  It also has an example of a filesystem tool, with separate migration script directories for each tool to manage.

A ZIP is available here, or you can browse the Subversion repository directly.  As always, comments/questions/bugs are welcome and appreciated.

Excalibur Constants

I believe I've blogged about Excalibur (an RPN calculator for Win32) before, but I just discovered the wide array of constants that it has built in:

Excalibur Constants

While I can't say that I use many of those numbers in my daily calculations (which are usually either pixels or dollars), the last one is definitely useful.

Regular Expression Backreferences and the Non-Greedy Modifier

Update: James Allen caught a formatting bug. It seems WordPress doesn't like my coloring, and when present, swaps the double quotes for "smart quotes". I've removed the coloring, and it seem to be fine again.

Someone posted a question on CF-Talk about using backreferences in regular expression search strings. Not the replacement string, mind you, but the search string itself. This is, as you'd expect, perfectly legal and can be incredibly powerful. While contriving an example, the one I came up with also required the non-greedy modifier, so I'll illustrate both. Here's the code in question (copy it to a CFM file to run it):

<cfoutput>
<cfset baseString = "some 'text' with ""quotes an' some apostrophes"" in it" />
<h2>Quoted Strings within #baseString#</h2>
<ul>
<cfset start = 1 />
<cfloop condition="true">
  <cfset result = REFind("(['""])(.*?)\1", baseString, start, true) />
  <cfif result.pos[1] LT 1>
    <cfbreak />
  </cfif>
  <cfset string = mid(baseString, result.pos[3], result.len[3]) />
  <cfset quote = mid(baseString, result.pos[2], result.len[2]) />
  <cfset start = result.pos[1] + result.len[1] />
  <li>#string# (quoted with #quote#)</li>
</cfloop>
</ul>
</cfoutput>

The backreference (the "\1" at the end of the regex) behaves exactly the same as when in a replacement string: it represents the "stuff" matched by the first clause wrapped in parentheses (in this case, "['""]"). Ignoring the middle part, the regex says find me either a single or double quote, then some "stuff", and then the same quote character.

In the middle is the non-greedy modifier (the "?", after the asterisk). Without it the ".*" would match as much as possible while still allowing the regex as a whole to succeed. In this specific case, the quote would match the single-quote before "text", the .* would match "text with "quotes an", and then the backreference would match the single-quote after "an". That's clearly not what we want, so we use the non-greedy operator to tell the .* that we only want it to match as much as it needs to, not as much as it can. Then it behaves correctly. Try deleting the non-greedy modifier and run it.

Enums and ActionScript's Static Initializers

I discovered today, while trying to synthesize an Enum type, that AS3 has the concept of a static initializer, which is awesome. In a nutshell, a static initializer is kind of like a constructor, but it's for the class object itself, not instances of the class. It gets invoked during classloading, after all static properties have been set, but the class is turned loose for general consumption. Here's an example of an Enum type (named ColorEnum) that uses the static initializer (in bold):

package com.barneyb.test {

  public class ColorEnum {

    public static const BLACK:ColorEnum = new ColorEnum(0x000000);
    public static const WHITE:ColorEnum = new ColorEnum(0xFFFFFF);

    private static var locked:Boolean = false;

    {
      locked = true;
    }

    private var _color:uint;
    public function get color():uint {
      return _color;
    }

    public function ColorEnum(color:uint) {
      if (locked) {
        throw new Error("You can't instantiate ColorEnum");
      }
      _color = color;
    }
  }
}

What this provides is a ColorEnum class that cannot be instantiated, with two instances stored in the BLACK and WHITE static constants. This sort of class locking is equally useful for singletons (think ServiceLocator if you use Cairngorm, though they implement it differently). Popular webcam chat livesexcamgirlsfree.com is free Usually, you use a private constructor for this type of behaviour, but since AS3 doesn't support that, you have to synthesize it. However, that's not all a static initializer can do.

Another use case is if you have a complex static variable that you need to initialize, but can't do in a single expression. For example, if you need to create multiple aggregated objects, you usually need multiple expressions. You can do this in a static initializer as well. Note that you can't set a static const within a static initializer, but you can set a static var (variables).

It's November…

… and that means NaBloPoMo, and this counts.

Interesting ChangeWatcher Behaviour

I've been working on a Flex app for visualizing market research data for the past week or two, and ran into an interesting behaviour of ChangeWatcher, quite at odds with how I'd have expected it to work.  As everyone knows, the Flash player is single threaded (and frame based), which makes for some interesting edge cases.  Part of that single threading is that each individual function invocation always happens within a single thread of execution, all within a single frame, and without any interleaving of other processing.  In simpler terms, a function always starts at the top and goes to the end, with sole control over the AVM. Events are handled, not as separate threads, but as a queue that gets processed at certain points (presumably the entry and/or exit of each frame).

With the ChangeWatcher, you can "watch" a given object property, and have a passed function (usually a closure) invoked when the property is reset.  Here's a simple example:

ChangeWatcher.watch(myObject, "nameArray", function(...args):void {
  Alert.show("myObject.nameArray was reset to " + myObject.nameArray);
});

That's all well and good, but what is interesting is that it happens as part of invoking the 'nameArray' setter on 'myObject'.  Note that the setter exists whether it's implicit or explicit.  I expected the callback to be event based, so the setter would execute and dispatch an event to be resolved at a later time.  It doesn't do that, the callback actually happens as part of the setter's execution.  This seems at odds with the event-driven nature of the Flash player.  Why does this matter, you ask?  Here's an example (assumed to be a method of the 'myObject' instance from above):

function doSomething():void  {
  nameArray = [];
  nameArray.push("barney");
  nameArray.push("heather");
}

I expected the behaviour of invoking 'doSomething' to go like this:

  1. set nameArray to a new Array instance
    1.  dispatch a change event to the change watcher
  2. add "barney"
  3. add "heather"
  4. return
  5. the event is received
  6. the ChangeWatcher callback is invoked (Alert "barney, heather")

However, because it's not event driven, that's not what happens.  Here's the actual behaviour:

  1. set nameArray to a new Array instance
  2. invoke the ChangeWatcher callback (Alert "")
  3. add "barney"
  4. add "heather"
  5. return

If you want the former behaviour, you have to reimplement the 'doSomething' method like this:

function doSomething():void  {
  var a:Array = [];
  a.push("barney");
  a.push("heather");
  nameArray = a;
}

That will give the desired behaviour, because when the watched setter is invoked, the array is already fully populated.  In this particular case (a contrived example), you could also do it this way:

function doSomethind():void {
  nameArray = [ "barney", "heather" ];
}

That's not going to be possible in a lot of cases, however.