Do Rewrite Yourself

Written by Vincent Bruijn

Inspired by _If Hemingway Wrote JavaScript_, I decided to start rewriting parts of my own code. I was delighted to see the variety of implementations of algorithms within the book. Whether it is a good or a bad part of JavaScript that anything can be written in different ways, why settle for the first result of my own coding? Why not give it another take?

I have to admit, I have only just started. It can be a bit tedious to rewrite something that already works. But I keep reminding myself that I do it because I want to understand the language better, and that helps.

Last week, for my #NaNoGenMo submission, I wanted a JavaScript function similar to PHP’s wordwrap(). It’s a string manipulation function with limited use, I think, and it surprises me that it ended up in the standard string manipulation library. But, as with a lot of programming languages, you’d better learn to live with its quirks. Anyway, wordwrap() takes up to four parameters, of which only the first is required: the string, the line width (75 by default), the line break ('\n' by default) and a flag that says whether overlong words may be cut (false by default). I wrote a basic implementation and ran some basic tests, which seemed acceptable; see the code below.

function wordwrap(string, width, lineBreak, cut) {
  if (!string) {
    return false;
  }
  var words = string.split(' ').reverse();

  var lineWidth = width || 75;
  var lBreak = lineBreak || '\n';
  var doCut = cut || false;

  var word;
  var result = '';
  var line = '';

  while (words.length) {
    word = words.pop();
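    // Wrap only when appending this word would exceed the width
    if ((line + word).length > lineWidth) {
      if (word.length > lineWidth && doCut === true) {
        // Keep the first lineWidth characters, queue the remainder
        words.push(word.substring(lineWidth));
        word = word.substring(0, lineWidth);
      }
      // Nothing to flush while the line is still empty (avoids a blank first line)
      if (line) {
        result += line.trim() + lBreak;
      }
      line = '';
    }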
    line += word + ' ';
    if (!words.length) {
      result += line.trim();
    }
  }
  return result;
}

What I like about this approach is the use of an Array: its length is a reliable measure of how much text is left, and it shrinks as elements are popped off. The tricky part comes with the fourth parameter: to cut or not to cut a long word. The basic implementation was quite easy, and then I started messing around a bit to add the fourth parameter.
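To make the array trick concrete, here is a minimal sketch of just the queue handling, separate from the line building. The sample text, the `width` of 8 and the `queue` and `pieces` names are made up for illustration. Reversing once means `pop()` hands back the words in reading order, and `push()` puts the remainder of a cut word back at the front of the queue.

```javascript
// Illustration only: sample text and names are assumptions, not part of wordwrap().
var width = 8;
var queue = 'long abcdefghijkl end'.split(' ').reverse();
var pieces = [];
var word;

while (queue.length) {
  word = queue.pop(); // next word in reading order
  if (word.length > width) {
    // Too long: put the remainder back so the next pass picks it up first
    queue.push(word.substring(width));
    word = word.substring(0, width);
  }
  pieces.push(word);
}

console.log(pieces.join(', ')); // long, abcdefgh, ijkl, end
```

Because the remainder goes back onto the same array, a word that is several widths long simply gets cut again on the following passes.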

// Example from PHP website, but in JS:
var string = 'A very long woooooooooooord.';
var result = wordwrap(string, 8, '\n', true);

console.log(result);
// Renders:
//
// A very
// long
// wooooooo
// ooooord.

Then, after a few days, I came up with the idea of rewriting things I had already coded. As said, I want to learn from my own way of coding, and my goal is to get to know JavaScript better.

function wordwrap2(string, width, lineBreak, cut) {
  if (!string) {
    return false;
  }
  var lineWidth = width || 75;
  var lBreak = lineBreak || '\n';
  var doCut = cut || false;

  var result = '';
  var sub;
  var lastSpaceIndex;

  while (string.length) {
    // The remainder fits on one line (<= avoids a trailing line break)
    if (string.length <= lineWidth) {
      result += string;
      break;
    }
    sub = string.substring(0, lineWidth);

    lastSpaceIndex = sub.lastIndexOf(' ');
    string = string.substring(lastSpaceIndex + 1 || lineWidth);

    sub = sub.substring(0, lastSpaceIndex) || sub;
    result += sub + (doCut === false && lastSpaceIndex === -1 ? '' : lBreak);
  }

  return result;
}

The second implementation uses a while loop too, but no arrays, just plain string manipulation. I chose to let the main string shrink on each iteration, so I could rely on its length in the loop condition. Again I hacked the cut parameter in by trial and error but, in the end, it worked.
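A single pass of that loop can be sketched on its own, which also shows what the `|| lineWidth` fallback is for. The sample text and the names `rest`, `line` and `skip` are assumptions made for this illustration.

```javascript
// Illustration only: one pass of the shrinking-string loop, with made-up sample text.
var width = 8;
var rest = 'A very long word';

var sub = rest.substring(0, width);          // 'A very l'
var lastSpaceIndex = sub.lastIndexOf(' ');   // 6
var line = sub.substring(0, lastSpaceIndex); // 'A very'
rest = rest.substring(lastSpaceIndex + 1);   // 'long word'

// Without a space, lastIndexOf() returns -1, so -1 + 1 is 0 (falsy)
// and the || fallback skips a full width instead.
var noSpaceIndex = 'abcdefghijkl'.substring(0, width).lastIndexOf(' '); // -1
var skip = noSpaceIndex + 1 || width;                                   // 8

console.log(line + ' | ' + rest + ' | ' + skip); // A very | long word | 8
```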

What have I learned from this? Running both implementations against a text of 50k+ words, the latter takes about 25ms and the former about 38ms. So my first implementation is slower. I didn’t dig into the details, but I assume it’s because of the array manipulations, whereas the second one only works on strings.
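For anyone who wants to repeat such a comparison, a rough timing harness could look like the sketch below. It is not the setup used for the numbers above: the `averageMs` helper, the generated 50,000-word text and the two stand-in functions are all assumptions, and the stand-ins should be swapped for the real implementations.

```javascript
// Use performance.now() where it exists, otherwise fall back to Date.now().
var now = (typeof performance !== 'undefined' && typeof performance.now === 'function')
  ? function () { return performance.now(); }
  : function () { return Date.now(); };

// Hypothetical helper: call fn(input) `runs` times, return the average in ms.
function averageMs(fn, input, runs) {
  var start = now();
  for (var i = 0; i < runs; i++) {
    fn(input);
  }
  return (now() - start) / runs;
}

// Hypothetical input: 50,000 copies of the same word, separated by spaces.
var text = new Array(50001).join('lorem ').trim();

// Stand-ins for the two implementations; replace them with the real functions.
function viaArray(s) { return s.split(' ').join('\n'); }
function viaString(s) { return s.replace(/ /g, '\n'); }

var arrayMs = averageMs(viaArray, text, 20);
var stringMs = averageMs(viaString, text, 20);

console.log('array:  ' + arrayMs.toFixed(2) + 'ms');
console.log('string: ' + stringMs.toFixed(2) + 'ms');
```

Averaging over several runs smooths out some of the noise, but numbers like these still vary per machine and per JavaScript engine.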

What else? I think I fell into the same trap twice: my approach was to implement the first three parameters first and the fourth one last. It feels too much as if it was hacked in afterwards. I need to work on considering all sides of the algorithm up front, and only then write code!
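As a thought experiment, this is roughly what a version could look like when the cut parameter is part of the design from the start. It is only a sketch, not a drop-in replacement: the name `wordwrap3` is hypothetical, it assumes words are separated by single spaces, and it returns an empty string for empty input rather than false.

```javascript
// Hypothetical third take: cutting is handled before a word ever joins a line.
function wordwrap3(string, width, lineBreak, cut) {
  var lineWidth = width || 75;
  var lBreak = lineBreak || '\n';
  var doCut = cut === true;

  var lines = [];
  var line = '';

  string.split(' ').forEach(function (word) {
    // Step 1: an overlong word is chopped into full-width lines of its own
    while (doCut && word.length > lineWidth) {
      if (line) {
        lines.push(line);
        line = '';
      }
      lines.push(word.substring(0, lineWidth));
      word = word.substring(lineWidth);
    }
    // Step 2: what is left is an ordinary word that either fits or wraps
    if (line && (line + ' ' + word).length > lineWidth) {
      lines.push(line);
      line = word;
    } else {
      line = line ? line + ' ' + word : word;
    }
  });

  if (line) {
    lines.push(line);
  }
  return lines.join(lBreak);
}

console.log(wordwrap3('A very long abcdefghijkl', 8, '\n', true));
// A very
// long
// abcdefgh
// ijkl
```

Collecting the lines in an array and joining them once at the end means there is no special case for the last line.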