Wednesday, March 19, 2014

It actually makes sense to implement your own sort

My mantra with algorithms is usually "somebody has probably already done it better." Why implement a sort algorithm if one already exists in your language?

With that it mind, it was a mere academic endeavor to implement Quicksort in JavaScript. I tried populating an array with a million random numbers, fully expecting Array.sort() to blow away my custom quicksort.

But guess what?

It didn't.

I ran some tests in node.js, and Array.sort() always came in last.
Array.sort() converts numbers to strings and then sorts them. For string sorting, it would probably blow away anything you could implement in JavaScript. However, for numbers, a custom sort wins.

Here are the times (in milliseconds) for sorting 1,000,000 integers

array sort (function(a,b) {return a-b;}): 518
quickSort: 235
array sort: 930

The native Array.sort() is the slowest. It returns invalid results with 999 showing up near 9996 since it is a text rather than numeric sort.

Ok, so if you are going to be sorting a huge number of integers, making your own quicksort is the way to go. But if you are looking at strings, the native sort is the way to go, right? Alas, no. For the next sample, I created 1,000,000 strings of 13 random uppercase characters and ran them through our three different scenarios. The quicksort could be used as is (the comparisons work on strings just as well.) The named sort function needed to modified slightly to handle the new sort:


function(a,b) {
if(a<b) { return -1; }
if(b<a) { return 1 };
return 0;
}


In this case, I'm just using capital letters. If there other characters involved (such as umlauts or accents), localeCompare should be used to get more appropriate results.

text sort of text array: 6625
quicksort sort of text array: 4091
named function sort of text array: 5007

Again, quicksort was the fastest, a custom function came in second, and the native sort was last. The percentage difference was not as great as with the integer sort, but the overall time difference was even greater.

The moral of this story: In most cases you will want to use your own sort function for both speed and accuracy. If you are sorting a small number of items, simply plugging a compare function into sort is probably all you need. If there are a large number of items, it would pay to implement your own sort code.

No comments:

Post a Comment