Sudoku HiddenSingleStrategy

The next Sudoku solving strategy I want to dig into is called "hidden single".  Here's the implementation to start with:

class HiddenSingleStrategy implements Strategy {

  static getTests() {
    [
      new Test(
        '0' * 9 + '000200000' + '0' * 9 + '000060000' + '0' * 9 + '000080000' + '0' * 9 * 2 + '000002000',
        {
          def cell = it.getCell(5, 5)
          cell.isKnown() && cell.value == 2
        }
      )
    ]
  }

  boolean play(Board board) {
    for (h in board.houses) {
      for (n in 1..9) {
        def cells = h.findAll {
          ! it.isKnown() && it.hasMark(n)
        }
        if (cells.size() == 1) {
          // we found a hidden single
          cells[0].setValue(n, this)
          return true
        }
      }
    }
    false
  }

}

As you can see, the test is a little different than for GameRulesStrategy.  Before I just supplied a board and a successful test was when the board is solved.  In this case I'm just checking for if cell (5, 5) is known and it's value is 2 (no need to solve the whole board).

The strategy itself is pretty simple: for every house, for every number, see if any cell is the ONLY cell in the house that can contain the number, and if so, set that cell to that number.  Once we find a cell to set, the method exits (returning true) and does not continue the search.  This is the reason for the 'for' loops instead of the Groovy 'each' iterator.  This lets GameRulesStrategy "clean up" the board before continuing the search.  In this case it's probably not a big deal, but with the more complex strategies this is an essential optimization.

Testing LaTeX Support

I just set up the WordPress \LaTeX plugin so I can do some formula stuff here.  This is just a little test post with some example formulas taken from my sun/earth collision model.

Euler's Identity (at different sizes): e^{i \pi} + 1 = 0

e^{i \pi} + 1 = 0

Gravitational Constant: G= 6.673 \times 10^{-11} m^3\: kg^{-1}\: s^{-2} = 6.673 \times 10^{-11} \frac{m^3}{kg\:s^2}

Force due to gravity: F = G \frac{M_1 M_2}{r^2}

Inline also works (e.g., 2 + 2 = 4).

If you use shortcodes within a paragraph (e.g., 3 + 1 = 4) instead of the inline notation, you'll get a block. (This is a custom mod.)

Finally, if you put an inline in its own paragraph, it'll do this:

8 \div 2 = 4

Unless you explicitly state that it is a block:

8 \div 2 = 4

The reverse of that, as you'd expect, lets in-paragraph shortcodes work too (e.g., 3 + 1 = 4).

If you want to see the raw \LaTeX, it's in the tooltip/title of the image.  WordPress.com provides the encoding service, though the plugin supports a local/third-party image service as well.

Front Controllers Should NOT Extend Application.cfc

I've been playing with FW/1 a bit on a personal app, and it has proven incredibly frustrating due to multiple manifestations of a single problem: your Application.cfc HAS to extend the framework in order to use the framework.  My complaint really has nothing to do with FW/1 in particular, the exact same argument could be made against Fusebox's Application.cfc integration (but FB at least provides a "normal" way to use it).  And just to be clear, I'm also not railing against Sean Corfield, even though he happens to be the author of both FW/1 and Fusebox's Application.cfc integration.

The first problem is due to Adobe's seemingly mindless choice to require the use of Application.cfc for per-request settings (datasource, mappings, ORM config, etc.), rather than doing it the right way with tags (in the style of the CFAPPLICATION tag).  With Application.cfc being the only place you can define any of this, you cannot use Application.cfc for an individual frontend, since it has to be shared across ALL frontends.

Consider a prototypical blog.  It has a public side (where the public reads), and an admin side (where authors write).  Two separate front ends; one single application.  If your front controller is bound to Application.cfc, you're forced to either run two separate applications or a single dual-purpose frontend.  Either one is a mess, either reducing encapsulation or increasing duplication.  At the very least.

Now consider a different example: an app with a single frontend that also needs one, single, solitary, standalone page for something.  Maybe even just for a quick one-off test script.  You create 'test.cfm' in your directory (so it gets the proper Application.cfc context so you can do your ORM magic), and hit it in the browser.  Oops, your framework decided with it's onRequest handling that it's going to just do it's thing, completely ignoring your template.  Different manifestation, same problem, though this one can be mostly addressed by overriding onRequest with custom behaviour that conditionally invokes super.onRequest.

Like the majority of places where inheritance is used, the proper solution is composition.  Rather than having your Application.cfc extend the framework, let your application compose the framework into itself.  That way it happens on the application's terms, rather than the framework's terms.  I understand that just extending the framework is desirable for ease of initial setup, so I'm not saying you can't do that, just you (as a framework author) should provide an alternative (like Fusebox's fusebox5.cfm).  Then I (as the application developer) can decide how the framework should be used.

Just to be clear, the problem with using inheritance where composition is the correct choice rests on both the shoulders of Adobe (for making request settings part of Application.cfc, rather than composed in with tags), and framework authors (for requiring Application.cfc to extend the framework).  Addressing either of these problems would handle the first manifestation (paragraph 3), but only the framework authors can deal with the second manifestation (paragraph 4).

Bottom line, don't wire yourself in so you can self-invoke, let me invoke you when and where I want you.

Goodbye JRun, Hello Tomcat

Last night I finally kicked JRun to the curb and replaced it with Tomcat on my personal server.  I knew JRun was a pig and that I'd get some improvements by switching, I'd just never gotten around to doing it.  I don't know why I waited.  Here are two graphs from my server, see if you can figure out when the switch happened:

Migration was mind-numbingly simple:

  1. download and unzip Tomcat 6.0.24
  2. change server.xml so the HTTP connector listens on 8082
  3. copy everything from the JRun webroot to the Tomcat webroot
  4. update Apache's config to proxy to Tomcat's port number (8082) instead of JRun's (8300)
  5. start Tomcat, stop JRun, restart Apache

Since I'm running CF8, nothing else is required.  If I were running CF9, I'd need JTA (from http://java.sun.com/javaee/technologies/jta/index.jsp) so Hibernate would work, since Tomcat doesn't bundle it (it's a web container, not an app server).

If you looked carefully at the graphs you'll have noticed two "Tomcat" series, rather than one.  I neglected to increase the MaxPermSize when I first set it up, so the "swap" at about 8:45am was making that change.  The "LCYSA" process is another Tomcat running a pair of Magnolia instances (in different contexts), "Apache" is Apache HTTPD, and "MySQL" is the MySQL 5.0 server I store everything in for all apps.

The charts are generated by a simple CF app that parses `ps v -A O-v` output into a database, and then builds Google Charts from the data. You can view the app, or check out the source. It's not elegant, refined, maintainable, or really even very functional, and I had no intention of sharing until magnus asked in the comments. But I also prefer transparency, so there it is.  A crontab runs `capture_mem_state.sh` every 10 minutes, and then the CFML reads in the files it generates. Play online games on the friv games site with the whole family. Only the best online games are presented on this mega portal.

Interesting ColdFusion CFQuery "Feature"

And just to be clear, when I say "feature" I actually mean "completely broken behaviour," though for a change of pace it's not with the compiler.  Probably.  Hopefully.

Consider this innocent-seeming code:


  insert into test (s) values (
    'dsn_A - record 1'
  )

Every CF app has this (or equivalent) in a million places.  Now let's change it slightly to make the string we're inserting dynamic (I'll show the 'getMsg' UDF in a moment):


  insert into test (s) values (
    'dsn_A - record 2 #getMsg()#'
  )

So far so good.  Now here's the UDF:


  
  
    select count(*) as c
    from test
  
  

What would you expect this to do?  Return a string, of course, containing the recordcount of the 'test' table in "dsn_A", and it does this perfectly.  It also has the glorious side effect of changing the outer CFQUERY's datasource to "dsn_A".  To put that another way, a given query does NOT execute with the datasource passed to the opening CFQUERY tag.  Rather, it executes with the datasource passed to the last CFQUERY tag before the closing CFQUERY tag. Don't believe me?  Here's a test case: Play the best friv games web-site online. The most popular collection of friv games are presented on this mega portal.


  
  
    select count(*) as c
    from test
  
  



  insert into test (s) values (
    'dsn_A - record 1'
  )


  insert into test (s) values (
    'dsn_B - record 1'
  )



  select * from test





  select * from test





  insert into test (s) values (
    'dsn_A - record 2 #getMsg()#'
  )


  insert into test (s) values (
    'dsn_B - record 2 #getMsg()#'
  )



  select * from test





  select * from test


In order to run it, you'll need to two DSNs configured ("dsn_A" and "dsn_B"), pointed at two distinct databases, each with a 'test' table containing a single varchar column named 's'.

This was tested on CF8 and CF9, with and without CFQUERYPARAM.  I also tested with implicit and explicit datasources on CF9.  Same behaviour in all cases.  Wheeee…

Fusebox's Noose

From an email Sean Corfield sent to the Fusebox5 mailing list (http://tech.groups.yahoo.com/group/fusebox5/message/4566):

I just wanted to provide a brief update on [Fusebox and 4CFF]. 4CFF discussed Fusebox with TeraTech (specifically John Zhu of 4CFF and Michael Smith of TeraTech) and were unable to reach an agreement on Fusebox joining 4CFF. One particular sticking point was that Michael wanted TeraTech to be paid for the Fusebox copyright and name.

This is exceptionally disappointing news.  Fusebox is an ancient web framework, regardless of what language you use, and has been under TeraTech's control for only a brief portion of that lifespan.  The vast majority of development and building of community happened before TeraTech took the reins, and even under TeraTech's "management", most contributions have come from outside their company.

Hal Helms (who helped shepherd Fusebox for many years) and John Quarto-von Tivadar (who provided both the architecture and initial implementation of Fusebox's current compiled model) held the rights to the Fusebox name and IP (via The Fusebox Corporation) for many years, and turned it over to TeraTech in 2006.  Since that time, Sean has developed and released Fusebox 5.5 and 5.5.1, and little else has happened.  This is not an exemplary track record for TeraTech.

In the past month I have made some significant optimizations inside the 5.5.1 cores to reduce memory consumption and speed the compiler.  These efforts were partially sponsored by a third party, have been donated to TeraTech on behalf of myself and the sponsor, and are slated to be released as Fusebox 5.6 sometime this year.  I also helped John with the 4.0 architecture, wrote a significant portion of the 4.5 release, and contributed much of the 'cf' lexicon, along with various smaller bits and pieces along the way.  In addition to the core files themselves, I have been a member of Team Fusebox for a number of years, and an active member in the community for much longer.  To my knowledge, I have not received official credit for these contributions aside from source comments within the 'cf' lexicon verbs and being listed on the Team Fusebox web page.

As someone who has invested significantly in Fusebox over the years, this refusal of TeraTech to contribute the name and IP to 4CFF over the issue of money is a slap in the face.  I know Fusebox has helped innumerable developers over the year (myself included), and that is more than compensation enough for the contributions I've made, but to see TeraTech have the gall to demand payment for what I (and many others) have freely donated is very disappointing.

I sincerely hope that Michael and TeraTech will reconsider their position on the issue.

WordPress 3.0

I just pulled down the latest and greatest from WordPress's SVN repository to give the new 3.0 code a whirl.  Specifically, I wanted to see what was happening around the WP.org and WPMU merge that is one of the big features of 3.0, since I run a bunch of WP.org blogs that I'd love to have multihosted, and at work we run an WPMU installation and are in the works of three more.  Having a single way to do everything is quite compelling, especially if there's a way to transparently "upgrade" from a standalone site to a multihosted one.

I was initially disappointed when I looked at the source, because the MU bits seemed to be missing, but a little more digging showed telltale signs that it was at least partially integrated.  After setting up the installation, I found a mysterious "Network" option in my "Tools" menu, clicked it, and was presented with basically the MU installer UI.  Sweet!

Submitting that resulted in a page with three simple instructions: create the blogs.dir directory, install this new wp-config.php script (in a TEXTAREA), and update .htaccess with this new config (also in a TEXTAREA).  My suspicion is that if PHP had write access it would've done all three automatically.

After following instructions and clearing my cookies (not explicitly mentioned), I had a fully functional multihosted WordPress install with the normal MU site administrator menu and everything.  This is awesome.  And just in case that was unclear, it's awesome.  The database, of course, was refactored behind the scenes.  Unfortunately it still uses the tableset-per-blog paradigm (just minus the "_1″ infix for the root blog) rather than a single tableset with foreign keys, but there's no way they could change that without having all kinds of upgrade issues for existing WPMU installations.

There are still a pile of things to update (like the per-blog permalink config shouldn't attempt to write .htaccess) to get the WP.org functionality to morph into the WPMU functionality, but for the "normal" authoring tasks it appears to be as solid as ever.  I suspect a careful administrator could, with a bit of patience, produce a security setup that would let authors have the run of their site without breaking anything even before the actual restrictions are put in place by Automattic.

In any case, very exciting to see this integration project already so far along.  WordPress.org says 3.0 is slated for March, which is crazy soon, but I'm all for it.  I've been waiting to upgrade a pile of WP.org blogs to a WPMU instance for a while now, and this is going to make it much easier.

Sudoku GameRulesStrategy

A couple days ago I posted about implementing a Sudoku solver in Groovy as a sort of cross-training exercise, and promised to delve into the strategies a bit more.  So here's GameRulesStrategy:

class GameRulesStrategy implements Strategy {

  static getTests() {
    [
      new Test(
        '900637100030001000107520300004810500019000820006059400002048705000200080001793006'
      )
    ]
  }

  boolean play(Board board) {
    def madePlay = false
    board.cells.findAll {
      it.isKnown()
    }.each { c ->
      c.houses.flatten().findAll {
        it != c && ! it.isKnown()
      }.each {
        madePlay = it.removeMark(c.value, this) || madePlay
      }
    }
    madePlay
  }

}

As with all strategies, it starts with test cases that exercise the strategy: in this case a single board that can be solved by application of the rules alone.  That board spec, just for reference, describes this board:

+-------+-------+-------+
| 9     | 6 3 7 | 1     |
|   3   |     1 |       |
| 1   7 | 5 2   | 3     |
+-------+-------+-------+
|     4 | 8 1   | 5     |
|   1 9 |       | 8 2   |
|     6 |   5 9 | 4     |
+-------+-------+-------+
|     2 |   4 8 | 7   5 |
|       | 2     |   8   |
|     1 | 7 9 3 |     6 |
+-------+-------+-------+

There are three things a strategy can do when asked to play: remove a mark from a cell, set a cell's value, or nothing.  It's up to the strategy to determine if doing several of them at once is appropriate or not.  And keep in mind that a cell with only one mark remaining is implicitly set to that value, so removing a mark can actually set a cell's value behind the scenes.

For GameRulesStrategy, we're only concerned with removing marks from cells, and we can do many of them in a single pass without stepping on our own toes.  We start by asking the board for all the cells, and then grabbing only the known ones.  Then we iterate over those cells, and for each one get the cell's houses (the row, column, and block the cell is in), flatten them into a single collection, and find all the other cells that aren't know.  Finally we iterate over those cells and remove the mark from each of them, keeping track of whether it actually did anything in the 'madePlay' variable.

If you're interested, this same algorithm can be implemented with for..in loops, instead of .each { } constructs, which is a technique I've used in later strategies where the closures get "in the way" of returning from the method early.  Here's the loop based implementation (minus the tests):

class GameRulesStrategy implements Strategy {

  boolean play(Board board) {
    def madePlay = false
    for (c in board.cells.findAll {
      it.isKnown()
    }) {
      for (it in c.houses.flatten().findAll {
        it != c && ! it.isKnown()
      }) {
        madePlay = it.removeMark(c.value, this) || madePlay
      }
    }
    madePlay
  }

}

If you're wondering, the loop performs marginally faster (<1%) than the iterator, but I find it to be a bit less readable, so I generally prefer the iterator approach.  If performance is paramount, Groovy is probably the wrong choice – it's forte is concise, readable code (and therefore development throughput) not CPU throughput.  People are more expensive than machines in most cases, and certainly this one.

This is a really simple strategy, but there are plenty more complicated ones to come, so stay tuned.

Subclipse 1.6's AWESOME New Commit Dialog

I just upgrade to Subclipse 1.6 and it has an awesome new feature: diffs right in the commit dialog.  It seemed a common use case for myself that when it came time to commit, I'd open my commit dialog and cycle through the files writing my commit message in Kate because I couldn't type while looking at the diffs.  Then I'd paste the message into the commit dialog and hit OK.  With the 1.6 release you can now see your commits right next to your message, and switch back and forth directly.  Kick ASS!

Sudoku is Groovy

Last week I spent a bunch of time implementing Sudoku solving strategies in Groovy.  Actually a pretty interesting exercise, I thought.  Even the simple solving techniques/strategies require a bit of thought to generalize into code.  This might seems like a pointless exercise, but think of it a cross training.  Football players can't hope to compete without lifting weights, and developers can't hope to be at their peak without doing some heavy lifting as well.

I picked Groovy because it let me avoid so much of the boilderplate code you might have with Java, and gives you some really powerful syntactic constructs that are missing from CFML.  Data structure searching/manipulation is the task at hand, and the code reflects that without much extra cruft.  I did run into some issues with aboring .each { } "loops", so had to convert a number of those to for..in loops instead.  I prefer the 'each' syntax, but it's problematic with aborting iteration several levels deep since you don't have the equivalent of a method-return or a labeled-break construct.

I'm going to walk through some of these strategies, but first I'm going to present the general Sudoku framework I built.  It's founded on two main types (Board and Cell) with a number of anciliary types (BoardSpec, House, Row, Column, Block, Chute, Stack and Band), and the actual solving types (Strategy, Solver, TestRunner and Test).  Starting at the top, the "runner" script looks like this:

new TestRunner(new Solver([new GameRulesStrategy()])).test()

That'll create a new Solver with just the GameRulesStrategy, and then supply that to a TestRunner to run the test (using the Strategy's internal test case(s)).  Here's GameRulesStrategy:

class GameRulesStrategy implements Strategy {
  static getTests() {
    [new Test('900637100030001000107520300004810500019000820006059400002048705000200080001793006')]
  }

  boolean play(Board board) {
    def madePlay = false
    board.cells.each { c ->
      if (c.isKnown()) {
        c.houses.each { h ->
          h.findAll {
            ! it.isKnown()
          }.each {
            if (it != c) {
              madePlay = it.removeMark(c.value, this) || madePlay
            }
          }
        }
      }
    }
    madePlay
  }
}

The getTests() method returns a List of Test objects, which contain a BoardSpec and an optional 'check' Closure for when the test has passed (the default is when the board is solved).  The BoardSpec uses a board literal – an 81 character string representing the board's cells starting in the upper left, and moving to the lower right, exactly like reading an English-language page.  1-9 represent themselves, any other character (zeros in this case) represent blank cells.

The solving loop simply iterates over each strategy invoking play(Board) on each one in turn, where a play is either removing a mark from a cell, or setting a cell's value.  Cells with a single mark are automatically promoted to a "known" cell.  Any time a strategy makes a play, the loop starts back at the first strategy.  Itereration continues until every strategy fails to make a play (the test fails), or the board passes the check (usually being solved).

Raw code is available at https://ssl.barneyb.com/svn/barneyb/sudoku/groovy/trunk/ and requires Groovy 1.7 to run.  You should be able to export the code and run 'groovy driver.groovy' and have it.