Project Euler

In the comments to my post about my prime sieve from last week, Duncan pointed me to Project Euler, which is a collection of math problems that you can solve.  Not sure if it makes me crazy that I think it's really cool, but I do.  Even better, once you solve a problem, you gain access to a protected forum thread and a nice PDF writeup discussing the problem and possible solutions.

For example, problem 1 can be solved by a simple loop/modulo algorithm, but that gets expensive as the range increases in size (it's O(n)).  For <1,000 it's quite speedy, but for <1,000,000 it takes a long time.  There is a far better algorithm using a series expansion/summation that operates O(1).  The writeup includes discussion of both algorithms, including code.  Very cool.

If you want to find me out, I'm 'barneyb'.

CFGroovy 1.1 Goodness?

For CFGroovy 1.1, I want to do this sort of thing, where GroovyFactoryBean will instantiate the specified Groovy object and return it, suitable for injecting into an arbitrary CFC of choice:

<bean id="groovyFactory" class="cfgroovy.GroovyFactoryBean">
  <constructor-arg name="cfgroovy"><ref bean="cfgroovy" /></constructor-arg>
</bean>
<bean class="cfcs.service">
  <constructor-arg name="dao">
    <bean factory-bean="groovyFactory" factory-method="getObject">
      <constructor-arg name="clazz"><value>com.barneyb.Dao</value></constructor-arg>
    </bean>
  </constructor-arg>
</bean>

That XML is taken from a working app, but it has all kinds of problems.  In case it's not obvious what's happening, 'cfcs.service' is a CFC whose constructor is passed a Groovy bean (of type 'com.barneyb.Dao').  The Groovy object is instantiated by the 'cfgroovy.GroovyFactoryBean' bean.  I've considered Spring, but I don't want to mix the Spring and ColdSpring dialects, add the extra JARs, or force splitting beans into separate XML files based on language.

Anyone have thoughts on the matter?  And the base ground-rules for CFGroovy, if you don't remember, are as follows (so don't violate them):

  • CTRL-S, ALT-TAB, F5 must work in every case, no container restarts.
  • Code must work on CF8.0.1 and Railo 3.0

CFGroovy 1.0 RC3

This is the final expected RC for CFGrovoy 1.0 and includes a couple more API tweaks (backwards compatible, of course).  Bad form to do this right now, but like the attributes-to-params change, I want to get them in before the 1.0 release so I can build the next round of enhancements without mucking up the API.  Hopefully.  All the changes deal with Hibernate session management, so if you're not using HibernatePlugin, nothing has changed.

From most to least relevant, here are the changes:

  • HibernatePlugin's getSessionForRequest and closeRequest methods are deprecated and replaced with getCurrentSession and closeSession
  • I've added openSession to HibernatePlugin (which always creates a new session) to complement getCurrentSession (which creates on demand).
  • The 'sessionFactoryWrapper' binding has been deprecated and replaced with a new 'sessionFactory' binding.  It still points to the same object (an instance of SessionFactoryWrapper)
  • SessionFactoryWrapper (the internal Groovy class) has had exactly equivalent changes made to it, with the three new methods NOT requiring a pageContext parameter.  Instead, a setPageContext method is exposed for providing it once per request.
  • HibernatePlugin has grown an initializeRequest method that will call SessionFactoryWrapper.setPageContext for you.  Call from onRequestStart or the like.

Just to be clear, the API changes are backwards compatible; existing applications can continue to use the deprecated methods for the time being, but updating to the new API is recommended.  After a couple weeks in the wild I'll tag an release an official 1.0, hopefully exactly matching RC3.

As always, engine source and the demo app (including the engine) are available from Subversion, and full details are on the project page.  I've not bundled a ZIP download of this release since I figure most people will just wait for the official 1.0 release since no bugs were fixed.

Awesomest Code Ever

I found this uncommitted change to a file on one of our shared dev servers today:

<CFIF IsDefined("request.traceactive")><CFELSE><CFSET request.traceactive = "false"/></CFIF>
<CFSET request.traceactive = 'false'/>

I don't even know where to start…

The Groovy Sieve of Barney

Ok, it's really Eratosthenes', but it's my implementation (in Groovy, of course) along with simple display (in CFML, of course), and can be found here (with full source, of course).  If you just want to see the core sieve, here it is ('bound' is the upper range of the search):

class Node {
    def value
    Node next
    Node prev
    def Node(value) {
        this.value = value
    }
    def remove() {
        next?.prev = prev
        prev?.next = next
    }
    def addAfter(value) {
        def n = new Node(value)
        n.next = next
        n.prev = this
        next = n
    }
}
// build the number list
head = new Node(2)
tail = head
if (bound > 2) {
    (3..bound).each{
        tail.next = new Node(it)
        tail.next.prev = tail
        tail = tail.next
    }
}
// start the actual sieve
end = Math.sqrt(bound)
for (curr = head; curr.value <= end; curr = curr.next) {
    for (target = curr.next; target != null; target = target.next) {
        if (target.value % curr.value == 0) {
            target.remove()
        }
    }
}

My initial implementation used an ArrayList for storage just to ensure I had the algorithm correct.  I knew that wouldn't scale because of the abundance of removal operations (and therefore sequential copies), but it's simple.  After that I switched to a linked list implementation using my custom Node class.  This cleaned up the sieve implementation a bit and also made is significantly faster with large sets.  However, I'd expect to see higher memory usage with this version because of the extra pointers.

The example is limited to scanning up to 50K, but with that limitation removed it can scan up to 100k in around 5 seconds, and 500K in about 50 seconds (on my little Celeron server).  The ArrayList implementation took about 11 seconds for 100K, and spun on 500K until I killed it.  Feel free to grab the code and remove the limit on your own hardware if you're curious.

Not really any point to the exercise, but lots of fun to pipe Nightwish into my head for an hour and bang out some "real" code again.  Haven't gotten into pure low-level algorithms like that in a long time.  And Groovy's just fun to write; I never have to wait for my fingers to catch up with my mind because it's so concise.  Whipped up all the CFML first as the main course and the wrote the two implementations for dessert.

Calling FB3Lite as a Custom Tag

Today at work, Joshua wanted to invoke one of our FB3Lite apps from within an external CFM file (a CMS template) to reuse some code.  So I added a little snippet to the top of index.cfm to detect if it's being called as a custom tag and only execute during the start phase.  So you can do this:

<cfimport prefix="myapp" taglib="/path/to/fb3lite/app" />
<myapp:index do="myFuseaction" />

and you'll only get a single execution of the app, instead of one for each half of the custom tag.  This is a piece of functionality that real Fusebox has always had, but which I'd never used and so hadn't incorporated.

As always, the latest is available in Subversion at https://ssl.barneyb.com/svn/barneyb/fb3lite/trunk.

Base Conversion Functions

In the theme of URL shorteners, I whipped up some simple base conversion functions that don't have the hard limit of 36 that the inputBaseN and formatBaseN built-ins do, but otherwise are exact replacements.  Everything is controlled by the first line, the list of digits.  The functions will support whatever base you want, as long as you can supply enough unique digits.  Note that not all characters are valid on the URL, so for URL shortening you have to be somewhat careful what you pick.

The digit list below is 62 long.  You could add underscore, dash, period, plus, exclamation point, asterisk, apostrophe, left paren, right paren and comma to the list with relative safety, bringing the max base up to 72 URL-safe digits.  Depending on your webspace manager, you might be able to add slash and backslash.  And if you don't care about URLs, just shortening something, you can add pretty much anything.

Both functions will return the string "NaN" for an unsupported (too large or too small) base or an invalid number.   This is because I wrote them in CFSCRIPT and when I went to add bound checking I realized there still isn't a 'throw' statement in CFSCRIPT so I punted.  Write to Adobe.

The code should run on any CFML engine that supports CFSCRIPT UDFs.  I tested on either CF8 or Railo 3, but I'm not sure which.

digits = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
function decimalToBase(num, base) {
    var result = "";
    if (base LT 2 OR base GT len(digits)) {
        return "NaN";
    }
    result = mid(digits, num MOD base + 1, 1);
    while (num GT base) {
        num = num / base;
        result = mid(digits, num MOD base + 1, 1) & result;
    }
    return result;
}
function baseToDecimal(num, base) {
    var result = 0;
    var digit = "";
    var pos = "";
    if (base LT 2 OR base GT len(digits)) {
        return "NaN";
    }
    while (len(num) GT 0) {
        digit = left(num, 1);
        num = removeChars(num, 1, 1);
        pos = find(digit, digits);
        if (pos LTE 0) {
            return "NaN";
        }
        result = result + (pos - 1) * base ^ len(num);
    }
    return result;
}

My Real URL Shortening Service

For those of you who didn't get the joke last night, the whole thing was a joke.  While it does do exactly as advertised, it deliberately creates exceptionally long URLs.  In order for the app to actually return a shorter URL than you submit, the original URL must be at least 189 characters long.  I assumed the lunacy of the behaviour and the actual code to implement it would give it away.  Looks like I win the deadpaning contest again.

Here's a real implementation: http://u.barneyb.com/.  Note that the generated URLs aren't actually that short because I'm using a long domain name (13 characters), but if you just run the code on a shorter domain name that problem will go away.  I didn't figure it was worth it to go get a new domain name just for this.

As with the fake app, the full code is below, still organized into the same five files plus an Apache vhost.  It has exactly the same functionality, except that it's now backed with a database for resilience across server restarts.  This makes it actually production-viable for the most part.  Here's the vhost:

<VirtualHost *:80>
    ServerName      u.barneyb.com
    DocumentRoot    /home/www/u
    RewriteEngine   On
    RewriteRule     ^/([a-z0-9]+)$   /forward.cfm?key=$1 [PT]
</VirtualHost>

And here is the CFML code:

Application.cfm:
----------------
<cfapplication name="u" sessionmanagement="false" />
<cfset urlRoot = "http://u.barneyb.com" />
<cfset my.dsn = "urlshrink" />

index.cfm (unchanged):
----------
<cfoutput>
<h1>UrlShrink</h1>
</cfoutput>
<cfinclude template="frm_shrink.cfm" />

frm_shrink.cfm (form action is only change):
---------------
<cfoutput>
<p>To make a shorter URL, just paste your long URL here and hit "Go!".
</p>
<form method="post" action="#urlRoot#/geturl.cfm">
URL:
<input name="url" size="50" />
<input type="submit" value="Go!" />
</form>
</cfoutput>

geturl.cfm:
-----------
<cfquery datasource="#variables.my.dsn#" name="get">
        select id
        from lookup
        where longUrl = <cfqueryparam cfsqltype="cf_sql_varchar" value="#form.url#" />
</cfquery>
<cfif get.recordCount>
        <cfset key = get.id />
        <cfset newUrl = urlRoot & "/" & key />
<cfelse>
        <cftransaction>
        <cfquery datasource="#variables.my.dsn#" name="">
                insert into lookup
                        (longUrl
                        )
                values
                        (<cfqueryparam cfsqltype="cf_sql_varchar" value="#form.url#" />
                        )
        </cfquery>
    <cfquery datasource="#variables.my.dsn#" name="get">
                select last_insert_id() as id
        </cfquery>
        </cftransaction>
        <cfset key = lCase(formatBaseN(get.id, 36)) />
        <cfset newUrl = urlRoot & "/" & key />
</cfif>
<cfset percentSavings = -1 * (len(newUrl) - len(form.url)) / len(form.url) * 100 />
<cfoutput>
<h1>UrlShrink</h1>
<p>Your shortened URL is:
</p>
<p><a href="#newUrl#" style="white-space:nowrap;">#newUrl#</a>
</p>
<p>You've reduced the length of your URL by #numberFormat(percentSavings, ".0")#%!
</p>
<h2>Shrink Another?</h2>
<cfinclude template="frm_shrink.cfm" />
</cfoutput>

forward.cfm:
------------
<cfparam name="url.key" default="" />
<cfquery datasource="#variables.my.dsn#" name="get">
        select longUrl
        from lookup
        where id = <cfqueryparam cfsqltype="cf_sql_integer" value="#inputBaseN(url.key, 36)#" />
</cfquery>
<cfif get.recordCount GT 0>
        <cflocation url="#get.longUrl#" addtoken="false" />
<cfelse>
        <cfoutput>
        <h1>UrlShrink</h1>
        <p>The URL you're visiting isn't recognized.  Perhaps you'd like to shrink your own?
        </p>
        <cfinclude template="frm_shrink.cfm" />
        </cfoutput>
</cfif>

Finally, because it is now database backed, you need a CF DSN named "urlshrink" pointed to a database with this table.  The CREATE TABLE is from MySQL, but the in-code SQL should be cross platform (though I haven't tested it):

CREATE TABLE lookup (
  id int(10) unsigned NOT NULL auto_increment,
  longUrl text,
  PRIMARY KEY  (`id`),
  KEY `k_longUrl` (`longUrl`(255)) -- to speed stable url queries
)

Total code size has grown to 1,993 bytes (92 more than the joke code), almost entirely due to CFQUERY and CFQUERYPARAM.  There are also a lot more lines of code because of the way I write my SQL, which also contributes to increased bytes from leading tab characters.

My New URL Shortening Service

Joshua, Koen and I were discussing URL shortening services (tinyurl, bit.ly, is.gd, etc.) over lunch this week, and so I decided to write my own (UrlShrink) available here: http://www.barneyb.com/applications/urlshrink/app/

It's ludicriously simple.  I opted for memory storage (think Mailinator) instead of a database for performance reasons.  With the load I anticipate, I think it'll be a deciding factor.  I've also included stable URL generation (which I don't know that any other service does), so if you try to shorten the same URL repeatedly, you'll get the same result every time.  As well as being more user friendly, this further increases performance because it prevents artificial bloat to the keyspace, thereby making searches marginally faster.

The app is all CFML, aside from an Alias and couple RewriteRules to make the URLs pretty.  Here are the rules:

Alias   /applications/urlshrink/app /home/www/urlshrink
RewriteRule ^(/applications/urlshrink/app)/geturl   $1/geturl.cfm   [PT]
RewriteRule ^(/applications/urlshrink/app)/shorturl/([^/]*) $1/forward.cfm?key=$2   [PT]

And here's the CFML (in five files) in case anyone else is interested in running their own service:

Application.cfm:
----------------
<cfapplication name="urlshrink" sessionmanagement="false" />
<cfset urlRoot = "http://www.barneyb.com/applications/urlshrink/app" />
<cfif NOT structKeyExists(application, "forwardLookup")>
        <cfset application.forwardLookup = {} />
        <cfset application.reverseLookup = {} />
</cfif>

index.cfm:
----------
<cfoutput>
<h1>UrlShrink</h1>
</cfoutput>
<cfinclude template="frm_shrink.cfm" />

frm_shrink.cfm:
---------------
<cfoutput>
<p>To make a shorter URL, just paste your long URL here and hit "Go!".
</p>
<form method="post" action="#urlRoot#/geturl">
URL:
<input name="url" size="50" />
<input type="submit" value="Go!" />
</form>
</cfoutput>

geturl.cfm:
-----------
<cfif structKeyExists(application.reverseLookup, form.url)>
        <cfset key = application.reverseLookup[form.url] />
        <cfset newUrl = urlRoot & "/shorturl/" & key />
<cfelse>
        <cfset key = hash(form.url, "SHA-256") />
        <cfset newUrl = urlRoot & "/shorturl/" & key />
        <cfif len(form.url) GT len(newUrl)>
                <cfset key = hash(form.url, "SHA-512") />
                <cfset newUrl = urlRoot & "/shorturl/" & key />
        </cfif>
        <cfset application.forwardLookup[key] = form.url />
        <cfset application.reverseLookup[form.url] = key />
</cfif>
<cfset percentSavings = -1 * (len(newUrl) - len(form.url)) / len(form.url) * 100 />
<cfoutput>
<h1>UrlShrink</h1>
<p>Your shortened URL is:
</p>
<p><a href="#newUrl#" style="white-space:nowrap;">#newUrl#</a>
</p>
<p>You've reduced the length of your URL by an exceptional #numberFormat(percentSavings, ".0")#%!
</p>
<h2>Shrink Another?</h2>
<cfinclude template="frm_shrink.cfm" />
</cfoutput>

forward.cfm:
------------
<cfparam name="url.key" default="" />
<cfif structKeyExists(url, "key") AND structKeyExists(application.forwardLookup, url.key)>
        <cflocation url="#application.forwardLookup[url.key]#" addtoken="false" />
<cfelse>
        <cfoutput>
        <h1>UrlShrink</h1>
        <p>The URL you're visiting isn't recognized.  Perhaps you'd like to shrink your own?
        </p>
        <cfinclude template="frm_shrink.cfm" />
        </cfoutput>
</cfif>

Total size is 1,901 bytes; that's pretty respectable for such a full-featured app.

What is The Matrix?

It's control. And I've decided that the control is irrelevant. There are exactly three people in my life that I'll relinquish control to. Two of them don't yet reach my waist (that'll change), the other one can nearly look me in the eye (especially in heels). The first two pretty do, but they can't have full control anyway, they're too short. I'm cryptic, but yet it's doesn't really matter; my blog, my rules. I'm done fighting. I'm not surrendering, just standing aside from a battle that isn't worth it. I'll suffer, I know that from experience, but I'll suffer less.

On a completely tangential note, I like semicolons. The use of a conjunction, either in the middle of a sentance or to start a new sentence just doesn't have the same effect as a semicolon. The semicolon doesn't require the extra word, and I think leaving it out better represents the spoken word where a slight pause would be used, not an extra sound or three. I once read that a competent writer (of English) uses a semicolon about once per printed page. That seems like a shortage to me with it's inherent elegance, but it also seems about the frequency I use them. And no, it's not because I'm aiming at this perceived ideal. Notice the use of a sentence-starting conjunction. A semicolon would have been disasterous in that situation.

Wait for it…..

"this Barney guy is friggin' crazy"

There we go.

Surely others appreciate the delicacies of the English language as much as I do? Because they are delicacies, just underappreciated ones. Not to insinuate I never write the equivalent of frozen pizza, but when it flows……. mmmmmmm.