Calculating Levenshtein distance for fun and profit

April 16th, 2012

JavaScript has some pretty weak tools for strings. What method would you use to compare similar-but-different strings? Say, for example, you have a signup form and when someone is typing in their email address you want to be helpful and check to see if they may have typed “gmial.com” instead of “gmail.com” or “em.com” instead of “me.com”. I give you the Levenshtein distance!

The Levenshtein distance is an oft-used measure of the similarity of two strings: the minimum number of single-character changes (insertions, deletions or substitutions) you’d have to make to one string to turn it into the other. So two strings with a distance of 5 would require five single-character edits to be the same, a distance of 7 would need seven edits and so on.

For reference, I have used Carlos Rodrigues’ implementation of the Levenshtein distance in previous projects with great success. Yes, it is attached to the String prototype. No, I don’t have a problem with that, nor should you. Arguments about prototype purity will be saved for another day.

Levenshtein, how does it work?

In practice finding the distance between two strings is easy. Just do this:

var barDiff = 'bar'.levenshtein('baz');
console.log('difference: ' + barDiff);
//Outputs: "difference: 1"
var jsDiff = 'javascript'.levenshtein('JavaScript');
console.log('difference: ' + jsDiff);
//Outputs: "difference: 2"

Did you see that? Yes, Levenshtein is a case-sensitive comparison, just like the rest of JavaScript. Keep that in mind and sanitize/mutate your comparisons accordingly.
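For readers who want to see the moving parts, here is a minimal sketch of the distance as a standalone function, plus a helper that normalizes case before comparing. Both levenshteinDistance and suggestDomain are hypothetical names made up for this example; this is not Carlos Rodrigues’ implementation, just the textbook row-by-row approach.

```javascript
//Hypothetical standalone helper, not the String.prototype method used above
function levenshteinDistance(a, b)
{
    var prev = [], curr, i, j, cost;

    //Distance from the empty prefix of a to every prefix of b
    for (j = 0; j <= b.length; j++)
    {
        prev[j] = j;
    }

    for (i = 1; i <= a.length; i++)
    {
        curr = [i];
        for (j = 1; j <= b.length; j++)
        {
            cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
            curr[j] = Math.min(
                prev[j] + 1,       //deletion
                curr[j - 1] + 1,   //insertion
                prev[j - 1] + cost //substitution
            );
        }
        prev = curr;
    }

    return prev[b.length];
}

//Hypothetical helper: returns the closest known domain within
//maxDistance edits, or null for an exact match or no close match
function suggestDomain(domain, knownDomains, maxDistance)
{
    var normalized = domain.toLowerCase(),
        best = null,
        bestDistance = maxDistance + 1,
        i, distance;

    for (i = 0; i < knownDomains.length; i++)
    {
        distance = levenshteinDistance(normalized, knownDomains[i]);
        if (distance < bestDistance)
        {
            best = knownDomains[i];
            bestDistance = distance;
        }
    }

    return bestDistance === 0 ? null : best;
}

console.log(suggestDomain('Gmial.com', ['gmail.com', 'me.com'], 2));
//Outputs: "gmail.com"
```

Note that “gmial.com” scores 2, not 1: plain Levenshtein counts two swapped neighbors as two substitutions, so a threshold of 1 would miss it.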

That’s it! Take this information and go write something clever and useful for your users. Also, keep an eye on my GitHub account for a Backbone extension that will automatically suggest proper email provider spellings for your users.

Don’t tell me what to focus on, I’ll tell you!

October 27th, 2011

After a frustrating interaction with a form I whipped up a quick jQuery plugin for developers who don’t want to think about form field focus.

I present to you…

Polite Focus!

It’s a simple idea – use document.activeElement to determine if an existing field has focus. If not, give your field focus. Otherwise, leave the focus where it is, where someone else (user or previous programmer) decided it should be.
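The idea fits in a few lines. What follows is a rough sketch rather than the plugin’s actual source: politeFocus is a hypothetical name, and the document is passed in as an argument purely so the function can be exercised outside a browser. It assumes that a page with nothing focused reports either the body or null as its activeElement.

```javascript
//Hypothetical sketch of the idea, not the plugin's source
function politeFocus(field, doc)
{
    var active = doc.activeElement;

    //With nothing focused, activeElement is the body (or null)
    if (!active || active === doc.body)
    {
        field.focus();
        return true;
    }

    //Somebody already chose a field, so leave the focus alone
    return false;
}

//In a browser:
//politeFocus(document.getElementById('email'), document);
```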

That’s it.

Go forth and use wisely

I’ve created a repo on GitHub so you kids can branch and comment as much as you like.

Go clone the plugin on GitHub. Now.

Using document fragment when rendering a backbone view

October 27th, 2011

Backbone.js is pretty fun to use. Right out of the box it makes app development quick and organized. Fully grokking how to adapt to using view controllers is a bit of a change and the event system is happily powerful, yet blending it with jQuery events doesn’t seem immediately obvious. All the same it does a great job of giving great tools for organization and structure while staying out of your way.

One drawback to using a library that does magic is that if you’re not careful you can miss some important details. In this case I’m talking about injecting your views into the DOM. Blindly following the create view/render into element pattern as described in the documentation is usually fine. However, if you are using a view to render a list of children the $(this.el).append(view.render().el) quickly becomes costly.

For my needs I wanted to add an arbitrarily-long list of elements to an ordered list. Instead of blindly looping over models, creating views and shoving them into the list one at a time I decided to add all rendered views into a document fragment in memory and then append that fragment on to my list. Clean code for rendering, only one DOM adjustment for speed.

Example Code!

var containerView = Backbone.View.extend({
//... snip
	render: function() {
		var template = _.template($('#regular_ol_template').html()),
			self = this;

		$(self.el).html(template(self.model.toJSON()));

		self.list = $('.list', self.el);

		var frag = document.createDocumentFragment();
		_.each(self.model.things.models, function(thing) {
			// Render each child view into the in-memory fragment
			frag.appendChild(self.createThing(thing).render().el);
		});
		self.list.append(frag);

		return self;
	},

	createThing: function(thing) {
		return new ThingView({model:thing});
	}
//... snip
});

But why make a separate function?

What you’re not seeing is another method named addThing which uses createThing and does more work – to my model, spinning up other views, etcetera. It’s also good not to bind the creation of elements solely into your render method. You want to be able to add more things to your list without fully re-rendering it. That way, should someone else decide to use my model or view in the app, elements in the list can be adjusted without the performance overhead of creating the whole list again.

Result

There you have it! You have now rendered your Backbone.js views into memory and appended them to the DOM in one fell swoop. Instead of abusing the DOM in your loop cycle you’ve speedily created them in memory. Isn’t that much kinder to the DOM? The answer is yes, yes it is.

What do you think? How would you change this? Comments!

Accessing the browser’s stylesheets with CSSelection

October 20th, 2010

When writing applications that modify the DOM there’s one thing to keep in mind: your code will not be the only code touching the DOM. Defensive code is good code.

With defensive programming in mind there sometimes exists a need to peek into the CSS that is loaded into the browser. Say an application you’re writing depends on user-based styles that are dynamically loaded in. It’d be nice to inject those styles into the document CSS and know you’re not overwriting anything. Perhaps your code is meant to take an existing style and modify it.

To this end I present a work-in-progress of my first jQuery plugin, CSSelection.

Code: View CSSelection on GitHub

Download jquery.CSSelection.js

CSSelection starts by accepting a jQuery selector. With that selector you can read the current styles applied to that selector, add new styles, create such a selector if one does not exist or even remove the current selector from the stylesheet. I tried to keep the interface predictable to jQuery users at the expense of a bit of extra code. Passing no arguments retrieves the current rules, if any, of the selector. Passing the string “remove” removes the first matching selector found in the style sheets. Finally, passing an object literal with rules will either add them to the stylesheet or modify the matching selector.

While this code is used in production I’ll still call it beta as bugs are cropping up in IE9 and are sure to arrive in future revs of Safari and Firefox. Enjoy!

Usage examples

How to create a rule for HTML elements:

$('p').CSSelection({rules: {'font-size': '120%', 'color': 'red', 'background-color': '#fff'}});

Modifying existing CSS rule/add new rule:

$('.important').CSSelection({rules: {'text-decoration': 'underline overline', 'background': 'yellow'}});

Delete a CSS rule:

$('.annoyingDecoration').CSSelection('remove');

Get attributes for existing rule:

var attrs = $('.wickedAwesomeClass').CSSelection();
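For the curious, the lookup behind a plugin like this comes down to walking document.styleSheets and comparing each rule’s selectorText. The sketch below is an assumption about the general approach, not CSSelection’s source; findRule is a hypothetical name, and the list of sheets is passed in so the function can run outside a browser.

```javascript
//Hypothetical helper for illustration; not CSSelection's source
function findRule(styleSheets, selector)
{
    var i, j, rules;

    for (i = 0; i < styleSheets.length; i++)
    {
        try
        {
            //Older IE exposes the list as 'rules' instead of 'cssRules'
            rules = styleSheets[i].cssRules || styleSheets[i].rules;
        }
        catch (e)
        {
            continue; //Sheets from another origin throw when read
        }

        if (!rules)
        {
            continue;
        }

        for (j = 0; j < rules.length; j++)
        {
            if (rules[j].selectorText === selector)
            {
                return rules[j];
            }
        }
    }

    return null;
}

//In a browser:
//var rule = findRule(document.styleSheets, '.important');
//if (rule) { rule.style.backgroundColor = 'yellow'; }
```

An exact string comparison is the simplest thing that works here; browsers may normalize selectorText (case, whitespace), which is one place such a lookup can quietly miss.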

The subtle art of ranges – Intro

September 21st, 2010

You’re starting to see it more and more. You select some text on a website and — woosh — something happens. Sometimes you expect the action. Oftentimes you don’t. Regardless, these things are possible due to DOM ranges.

And DOM ranges are terrible.

I don’t mean they are a terrible idea. I mean they’re terrible to work with. Of course there is no single, standard API to work with. Of course Internet Explorer adds another level of challenge. And of course you’re going to at some point have to know how these things work if you expect to be a JavaScript ninja.

There are many reasons to deal with ranges. Some lame services will hijack your selection and inject their own tags. If you’ve ever needed to hack around a WYSIWYG editor then ranges will be on the docket. Or say you’re leading a team that allows users to select text on a page and do things like highlight it, make a note attached to it or even place a bookmark at that point in the page. That’d be my job.

The default range object seems good enough for a lot of tasks. Where it starts to show its weakness is interacting with the DOM outside of just the range.

You want to take what a user has selected and wrap it in a decorative <span>? That should be easy, right? What happens if that selection starts in the middle of an <li>, moves down a few paragraphs, enters a <div> and ends a few levels deeper in the node tree? Well, you can’t just wrap that in a single element.

The browsers all try to be helpful. Really, they do. If your range starts in the middle of an <em> tag but ends outside of it the range you get back from the browser will automatically split the tag properly if you wrap it. That’s nice. But if you remove that wrapper now you have a split tag that equates to more child nodes in memory than in code. That makes for bad assumptions later.
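One common way to do the wrapping, sketched here under the assumption of a browser with W3C Range support, is to pull the selected contents out of the document, drop them into the wrapper and put the wrapper back where the range started. wrapRange is a hypothetical name for this example, and it does nothing to solve the block-level nesting problem described above.

```javascript
//Hypothetical helper for illustration only
function wrapRange(range, wrapper)
{
    //extractContents() removes the selected nodes from the document,
    //cloning any partially selected ancestor such as a half-selected <em>
    wrapper.appendChild(range.extractContents());

    //insertNode() puts the wrapper at the start of the now-collapsed range
    range.insertNode(wrapper);

    return wrapper;
}

//In a browser:
//var selection = window.getSelection();
//if (selection.rangeCount > 0)
//{
//    wrapRange(selection.getRangeAt(0), document.createElement('span'));
//}
```

The cloning of half-selected ancestors is exactly the tag splitting mentioned above, which is why unwrapping later leaves extra nodes behind.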

During my research into and development of this project I’ve found a stunning lack of comprehensive information (much less example code) for DOM ranges. Obviously PPK is the place to start but his level of detail is light and — at best — beginner. I’ve worked with some guys at Google who are on the Closure Library team and are ridiculously smart. Even they didn’t plan for the basic needs I have. Or it could be I’m doing it wrong.

In any case, I’ll be working on a series of entries to this blog that will go through what I’ve learned, show some code, talk about common pitfalls, propose some best practices and hopefully shine some light on the dark corner that is DOM ranges.

Making friends with the DOM

November 19th, 2009

Any JavaScript developer who spends more than a blink of an eye working with the DOM has probably come to despise the API. It’s not that there’s a dearth of features (well…); I think it’s the verbosity that causes modern application development woes. For the one-off script that adds an onclick or modifies some text the DOM API is fine and dandy. Try scaling up to developing an application or, perhaps, a framework and things get annoying and messy right quick.

When Tommy finally gave jQuery a try and discovered there were no native DOM manipulation tools he decided to roll his own plug-in, which he calls FluentDom. The code is clean and quality, as I’d expect from him, but the interface drives me bonkers.

I’m a fan of configuration over convention, a concept which, when combined with my MooTools history, has led me to favor using object literals as configuration “objects”. When I use MooTools to create a new element the only arguments are the element type and an object literal with any configuration options. This is where Tommy’s code style differs. In FluentDom you set each attribute with its own method, and the calls can be chained together. This isn’t to say that FluentDom couldn’t be used with a configuration object, it’s just not as clean an implementation as I prefer.

In a message to him I said this:

The benefit to setting attributes via an object literal is that I can use one set of preset attribute values and pass that around. Quite handy if you’re doing something like iterating and creating a list of similar elements but particular ones might have subtle differences. Also easier to maintain and extend. Given your code it’d be a trivial modification to do this during a create call.
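To make that concrete, here is a small sketch of the preset-and-override pattern. The extend and createElement helpers are hypothetical, written for this example; they are neither the MooTools nor the FluentDom API, and the document is passed in only so the code can run outside a browser.

```javascript
//Hypothetical helpers for illustration; not MooTools or FluentDom
function extend(base, overrides)
{
    var result = {}, key;

    for (key in base)
    {
        if (base.hasOwnProperty(key)) { result[key] = base[key]; }
    }
    for (key in overrides)
    {
        if (overrides.hasOwnProperty(key)) { result[key] = overrides[key]; }
    }

    return result;
}

function createElement(doc, tag, attrs)
{
    var el = doc.createElement(tag), name;

    for (name in attrs)
    {
        if (attrs.hasOwnProperty(name)) { el.setAttribute(name, attrs[name]); }
    }

    return el;
}

//One preset passed around, with a subtle difference for one element
var fieldPreset = {type: 'text', 'class': 'field'};
var requiredField = extend(fieldPreset, {'class': 'field required'});

console.log(requiredField.type + ' / ' + requiredField['class']);
//Outputs: "text / field required"

//In a browser:
//var input = createElement(document, 'input', requiredField);
```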

Which do readers prefer, specific, chainable method calls or ambiguous configuration object?

Deleting dynamically generated elements

November 11th, 2009

A coworker, Tommy, posed an interesting question to me the other day. How would one handle dynamically adding elements to a form, especially with regard to giving the user ability to remove those elements? We both admitted to implementing this a few times and he described to me his method of iterating through elements until the right node was found and then removing it.

As a habit I avoid DOM traversals whenever possible so I suggested an alternate solution. Instead of having a controller function look for some kind of unique ID or scan the DOM why not break down the control to the individual element and programmatically make the removing function a closure with a reference to each node. That way the code to remove an element is only ever concerned with its own instance. No crawling the DOM, no messy lookups, no muss no fuss.

If that didn’t make sense, perhaps this snippet will.

var addControls = function(element)
{
	var remove = document.createElement('input');
	remove.setAttribute('type', 'button');
	remove.setAttribute('value', '-');
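	// Bind this node into the handler's own scope so the click
	// only ever concerns itself with its own instance
	remove.onclick = function(self)
	{
		return function()
		{
			if(confirm('Are you sure?'))
				self.parentNode.removeChild(self);
		};
	}(element);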
	
	element.appendChild(remove);
}

addControls(someNodeThatYouAlreadyFound);

Pretty simple. Questions? Let em rip in the comments.

Getting back to my roots

October 29th, 2009

Or: There are legit reasons to write your own library

Around the office it’s no secret that I’m a fan of the MooTools JavaScript library. For those of you who can remember back before jQuery, Moo.fx (from whence MooTools came) was a tiny, feature rich library for animations and effects. It was a welcome relief from having to write effects and deal with the cross-browser issues that invariably crop up. Compared to the heavy weight (and necessity) of including both the Scriptaculous and Prototype libraries, Moo.fx was a boon for byte-pinching developers and infrastructure managers.

Over the years as many libraries popped up and fell by the wayside, jQuery came along and showed us the difference between a library and a framework. Perhaps the most important result of competition between libraries was performance. I won’t hold MooTools above any other library arbitrarily but when they created SlickSpeed it was a great way to show performance between libraries and let developers decide what library best fit needs. John Resig came along and blasted out Sizzle and – BAM – kicked up the speed of jQuery by a wide margin. The healthy competition rages on and it just means better speed, compatibility and features for users of JavaScript libraries.

With the availability of tested, proven, mature libraries why would anybody ever bother writing their own native JavaScript code any more? I know that the last time I write document.getElementById() can’t come soon enough. Still, there are reasons to stick to pure, native JavaScript.

For starters, no matter how elegant or optimized it is, a library’s code will never be as efficient as native JavaScript. Sure, it may delegate to native functions and the cost could end up being a millisecond or two. That’s fine. Then again, do you want to be the one tracing back performance issues when your code locks up the browser because var header = $('header'); hit some kind of weird edge case?

I had a better reason than performance. I’m a realist. I don’t have the time or patience to write the fastest code in the west. I write code to solve problems. Functionality first, optimization second. That doesn’t mean that I write slow or hacky code. I just want to get the problem solved first and sometimes it’s easier to write something in straight JavaScript instead of finding a MooTools class or a jQuery plugin, making sure it doesn’t pollute the namespace, assuring that it runs reliably and trying to configure it for my specific needs, even though it was never meant for that.

Sadly, it has been a few years since I’ve been able to really stretch my JavaScript legs and run. From a CTO who banned the use – and all but the mention of – AJAX to projects that were more business rules than user interface I haven’t had just cause to go nuts on the client side. There have been occasional forays into widgets and code snippets but nothing serious. Any real geek should realize that a dearth of reasons to code passionately causes bad things to happen in a programmer’s brain.

The straw that broke the camel’s back was a manager walking over to me, asking me if JavaScript could do XYZ, to which I boldly assured him it could. I sat down, cracked open a fresh TextMate skeleton project and stared blankly at the screen. I had no idea how to accomplish the task, where to look for guidance or even if I could get it done. This could not stand. I needed a core JavaScript refresher. More than that, I needed to cover topics I had never mastered like, say, wholly p0wning the DOM.

It was at that moment that I decided to write my own JavaScript library. Not to have more features. Not to be the fastest. Not to be the most popular. Simply to better understand the language and environment in which it is most often run. A refresher on what I know, a chance to learn new tricks and an opportunity to better learn how to architect something bigger than a snippet.

Things are progressing nicely and when there is a stable base and interface I plan on releasing the code open and free. I doubt it will blow the minds of any seasoned JavaScript experts but already I have some tricks that I haven’t yet found in either MooTools or jQuery. It has been frustrating, fun and frivolous – all the best attributes of any personal code project.

I’ll be clear: I do not recommend writing your own library, framework or other massive code base. Not on a whim, at least. It’s not worth the time to do something that is only half as good as what already exists.

However, you cannot call yourself an expert at anything without knowing how the guts work. Knowing how fuel injection works does not make me a champion race car driver.

And that’s how I’ll end this chapter. JavaScript programmers are not race car drivers.

What isn’t in a name?

June 16th, 2009

If you don’t already know, polluting the global variable pool is a bad, evil thing in JavaScript land. If you’re not concerned with overwriting existing code then at least think about protecting your code from being overwritten by keeping it in your own namespace. All you need to create a namespace in JavaScript is an object literal.

Say you’re tasked with incorporating a third-party widget into your home page. You drop in their include but suddenly one of the scripts on your site breaks. You haven’t changed your code so you assume it’s the include causing trouble. However, a run through JSLint shows that their syntax checks out fine. Looking through your functions you narrow the misbehaving lines down to the following:

function findElement(el) {
    return document.getElementById(el);
}

function updateNews(content) {
    findElement('news').innerHTML=content;
}

That is some horrible code. Worse, that’s horrible code in the global pool. Cracking open the third-party include you see this:

updateNews = function(container,source,freq)
{
    // Snip....
}

My, that function name looks awfully familiar. In fact, it’s the same as yours. That means your code and the third-party code are fighting in the global variable pool. What’s a clever programmer to do? Do what works in other languages, of course!

Time to carve out your own namespace

This is ridiculously easy if you understand what object literals are and how to self-invoke an anonymous function. Those terms both sound scarier than they are. You simply create an object literal to act as your namespace – which is a bucket for all your code – then use anonymous functions to assign functions into it. Keeping with the theme of being a polite player in the global pool, also make sure your namespace is either brand new or extends any existing namespace.

var AY = AY || {}; //Don't overwrite existing namespaces
AY.News = function() //A 'News' bucket for all news-related tasks
{
    //The 'var' forces the findElement method 
    //into the News scope, not global
    var findElement = function(el)
    {
        return document.getElementById(el);
    }
    
    //Anything in the return becomes publicly accessible, yet 
    //is also able to access private methods and variables
    return {
        updateNews: function(content)
        {
            findElement('news').innerHTML = content;
        }
    }
}(); //Self-invoke to make this available immediately
AY.News.updateNews('Hai');

What’s happening here? Well, instead of dropping all of your code into the global scope you can stash it in a single namespace. In my case I’ve named my namespace AY, after my name. I used all capital letters to signify that this is not a variable or method belonging to any other code. For news-related items I’ve created a News bucket.

Inside of my News bucket there are a few things happening. I’ve decided that findElement should be a private method in my namespace, so by declaring it with var I designate it as existing only inside the AY.News function scope. Without a public accessor no outside code can run that method. My return passes an object literal which houses any public methods, in this case the updateNews method. Because they are defined inside the same function, these public methods can access private methods, which is why the updateNews code still works.

Finally, the closing parentheses cause the anonymous function to run immediately, or self-invoke. The function executes and hits the return, and the returned value is what gets assigned to AY.News. That return value is itself an object literal containing the public updateNews method, which calls the private findElement to perform the update.
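The public/private split is easy to verify with a DOM-free variation on the same pattern. AY.Counter below is a hypothetical bucket invented for this example; it only shows that whatever is left out of the returned object literal stays unreachable from outside.

```javascript
var AY = AY || {}; //Don't overwrite existing namespaces
AY.Counter = function() //Hypothetical bucket, for illustration only
{
    //Private: only reachable from inside this function
    var count = 0;

    var format = function(n)
    {
        return 'Count: ' + n;
    };

    //Public: everything in the returned object literal
    return {
        increment: function()
        {
            count += 1;
            return format(count);
        }
    };
}(); //Self-invoke to make this available immediately

console.log(AY.Counter.increment());   //Outputs: "Count: 1"
console.log(typeof AY.Counter.format); //Outputs: "undefined"
console.log(typeof AY.Counter.count);  //Outputs: "undefined"
```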

That’s it! There’s no more to it. I’ll grant that for personal scripts this technique is not absolutely necessary but if you ever want your code to play nicely in another environment it’s a good practice and makes other programmers less hesitant to use your code. Comments regarding execution of this technique are more than welcome. This is my personal flavor and because of JavaScript’s flexibility I’m sure there are some other great techniques out there. Comment away!


2009-06-17: Reordered and cleaned up copy.

Yet another article about closures

June 7th, 2009

…or when variables get too friendly and cause trouble

JavaScript is a functional language. Because of its functional nature you can do really fun things that may not be readily apparent. One of my favorite things about JavaScript is its implementation of closures: functions complete with their own private scope. Why would that ever be useful? It helps get around one of my least favorite things about JavaScript: scope. A closure is a function that, while it has access to globally-scoped items, has its own bound local scope and keeps hold of the variables that were in scope where it was created. You know those times when you only want a variable to apply to a certain event? No? How about an example?

I bet at first glance you’ll be able to tell me what this bit of code is supposed to do, but not why it doesn’t work as expected.

function addLinks()
{
    for (var i=0, link; i<5; i++)
    {
        link = document.createElement("a");
        link.innerHTML = "Link " + i;
        link.onclick = function ()
        {
            alert(i);
        };
        document.body.appendChild(link);
    }
}
window.onload = addLinks;

To be clear, this is a small function that will add five links to the page that read Link 0, Link 1, Link 2, etc. and, when clicked, should pop up an alert box with that link’s number, such as 0, 1, 2, etc. I know appending to the DOM inside a loop is bad practice and that these links are not inside of a block-level entity. For now pretend that these things aren’t important. What is important is that clicking on any of these links will always alert 5, the value i holds once the loop has finished, instead of the number belonging to that link.

Shared scope is the enemy of predictability

As the loop iterates, the variable i is being incremented and that number is applied to the individual links as they are made – so what’s happening here? JavaScript’s ultra-friendly scoping is stealthily making trouble. A var declaration is scoped to the whole function, not to the loop body, so there is only one i for all five handlers. If you were to break that loop when i was set to 3, for example, all links would alert 3.

You see it yet? Exactly. Each onclick handler refers to the one i that belongs to addLinks, not to a private copy of its own. By the time anybody clicks a link the loop is long finished. If addLinks were to set i = 'foo' after the loop, each link would alert foo. You’re right to think there has to be a way around this.

Bringing them together

So you have a function using a variable but the scope is out of whack? Let’s see if a closure can help! The trick will be to create a function that only knows what you tell it and make that function what gets called on the onclick. If that doesn’t make sense, read it again and then look at the code. The syntax and execution of closures is fairly counter to most programming techniques so putting yourself in that mindset will take some adjustment. A bit of updated code should help.

function addLinks()
{
    for (var i=0, link; i<5; i++)
    {
        link = document.createElement("a");
        link.innerHTML = "Link " + i;
        link.onclick = function (num)
        {
            return function ()
            {
                alert(num);
            };
        }(i);
        document.body.appendChild(link);
    }
}
window.onload = addLinks;

The onclick is now bestowed with a closure, and that closure binds the value of i into its own local scope so that you can change i to be whatever you like and the closure will always remember what it was when the closure was created. The outer function is invoked immediately by the (i) at the end, making it a self-invoked function. When it is invoked the current value of i is passed in and assigned to the num parameter, which makes it a local, contained variable. Then an anonymous function is returned, and that is what will be called when the onclick is triggered. That anonymous function doesn’t have its own num variable so it looks up the scope chain, sees one in the enclosing self-invoked function and uses it.
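The same behavior can be observed without touching the DOM. This sketch swaps the links for plain functions collected in an array; makeHandlers is a hypothetical name used only for this comparison.

```javascript
//Hypothetical helper: builds five functions, with or without a closure
function makeHandlers(useClosure)
{
    var handlers = [], i;

    for (i = 0; i < 5; i++)
    {
        if (useClosure)
        {
            //Self-invoked: num captures the value of i right now
            handlers.push(function (num)
            {
                return function ()
                {
                    return num;
                };
            }(i));
        }
        else
        {
            //Every one of these shares the single i of makeHandlers
            handlers.push(function ()
            {
                return i;
            });
        }
    }

    return handlers;
}

var shared = makeHandlers(false);
var bound = makeHandlers(true);

console.log(shared[0](), shared[4]()); //Outputs: 5 5
console.log(bound[0](), bound[4]());   //Outputs: 0 4
```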

That explanation might need a few passes to fully understand. Don’t worry, closures are a complicated concept to learn for programmers not well versed in functional languages. In upcoming entries I’ll talk about stretching JavaScript in two different but powerful ways: lambda functions and exploring public and private variables. Stay tuned!
