by Pedro Teixeira


Transcript:

Some two and a half years ago a friend told me about Node.js. “It’s JavaScript on the server, and it’s making a buzz, you should totally check it out”. At the time I was doing Ruby dev work. I knew JavaScript, I kind of liked it, so I gave it a try.

Downloaded node, installed it, did my first “hello world” script in 20 seconds, and bam! I had coded an HTTP server. No fluff code, no boilerplate. And it really performed well! I was immediately hooked.

I submitted a talk proposal about it for the following Codebits conference (which is an amazing conference that is held in Lisbon about all things computer-related). It was about Node internals. I had played with event-driven systems using EventMachine in Ruby land, and I wanted to know more about how Node worked internally. I learned about the event loop, ticks, callback dispatching, asynchronous I/O, the thread pool, the works. The talk got accepted and I kept learning.

At the time I didn’t have any decent source of learning material. The best thing around was the Node API docs, which were a bit sparse, so I found myself digging into the source code a lot. I then started doing some screencast tutorials, and I started each one with the desire to learn about a certain subject.

Then a friend of mine challenged me to write something about Node.js in the form of a book. I followed the lead and started doing just that. After some frantic writing and some long nights, I had an English-challenged piece of text that I started selling.

Then I decided to go to jsconf.eu. I had watched almost all the available video material, all the epic moments: Douglas Crockford saying IE must die, the Socket.io presentation, Ryan Dahl announcing Node.js. I wanted to experience it.

I arrived at the conference, and just before the first talk I introduced myself to Marak Squires from Nodejitsu and asked him something about his hook.io project. Immediately he said “let’s go grab a coffee and I’ll just show you how to do it on my laptop”. And he did, right then and there. I kept having great conversations with great people in the community. The talks were amazing too, track A and track B were great. And sometimes I just remained talking with some people in “track C”. And then out for drinks. Not much sleep. And then more talks.

After I came back home I just kept talking about the whole experience to my family and friends over and over.

I was simply amazed by the friendliness and talent of the JavaScript community.

Traveling and going to conferences is expensive. “I wish we had something like this back in Portugal” - I thought.

Nuno, Tiago, Bruno, Heitor, Luis and I started putting this conference together, and quickly learned that putting on a conference is two things: hard and expensive. We’ve learned a lot and will most surely learn a lot more during the days that lead up to the conference, but we want to put on a great experience for the attendees, the speakers and the sponsors.

We’ve got a bunch of great speakers lined up, most of them coming from far away places just to speak to us. If you’re in Portugal you shouldn’t miss this. If you’re not in Portugal, you should come here. Lisbon is such a great city, and again, the speakers we have lined up are amazing.


Banzai is a document processing framework for Node.js.

You define a set of pipelines into which you push documents. Each document in a pipeline has a given state. A state transition triggers a state entry handler that can transform the document and interact with the outside world. Each document ends in a defined state or in an “error” state.

Rollback and Playback

You can roll back the state of a document to a certain previous state, and play back the pipeline flow. This can be useful, for instance, if a given document enters an error state because of a bug or a networking problem somewhere. You can correct the bug, roll back to a previous state and play the pipeline from there on, hopefully escaping that error condition.

Each state transition has a “next state”, a priority and an optional pre-condition. The candidate transitions (there can be more than one) are evaluated in priority order. If a candidate has a pre-condition, it is evaluated, and the first candidate that matches has its state transition handler triggered.
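As a rough illustration of that evaluation order, here is a minimal sketch in plain JavaScript. It is not Banzai's API: the `pickTransition` helper, the shape of the transition objects and the state names are all made up for this example, and it assumes that a lower priority number is evaluated first and that a transition without a pre-condition always matches.

```javascript
// Hypothetical sketch, not Banzai's real API.
// Each candidate has a next state, a priority and an optional asynchronous
// pre-condition with the signature condition(doc, callback(err, matches)).
function pickTransition(candidates, doc, callback) {
  // Assumption: a lower number means "evaluate earlier".
  var sorted = candidates.slice().sort(function(a, b) {
    return a.priority - b.priority;
  });

  (function tryNext(index) {
    // No candidate matched.
    if (index >= sorted.length) { return callback(null, null); }
    var candidate = sorted[index];
    // Assumption: no pre-condition means the candidate always matches.
    if (!candidate.condition) { return callback(null, candidate); }
    candidate.condition(doc, function(err, matches) {
      if (err) { return callback(err); }
      if (matches) { return callback(null, candidate); }
      tryNext(index + 1);
    });
  }(0));
}

var candidates = [
  { next: 'archived', priority: 2 },
  {
    next: 'review',
    priority: 1,
    condition: function(doc, cb) { cb(null, doc.total > 100); }
  }
];

var picked = {};
pickTransition(candidates, { total: 50 }, function(err, transition) {
  picked.small = transition.next;
});
pickTransition(candidates, { total: 500 }, function(err, transition) {
  picked.large = transition.next;
});
```

The pre-condition here calls back synchronously to keep the example short; a real one could do I/O before answering.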

Each state transition can have an “undo handler” that takes care of undoing what the transition did. This can be useful if external services were changed and you need to revert those changes when you revert a transition.
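To make the idea of undo handlers more concrete, here is a small sketch of a rollback loop. Again, this is not how Banzai implements it: `rollbackTo`, the `history` array and the `externalIndex` object (standing in for some external service) are assumptions invented for the illustration.

```javascript
// Hypothetical sketch, not Banzai's real API.
// `history` lists the transitions a document went through, oldest first.
function rollbackTo(targetState, history, doc, callback) {
  (function undoLast() {
    var last = history[history.length - 1];
    // Stop once the most recent remaining transition ended in the target state.
    if (!last || last.to === targetState) { return callback(null, doc); }
    history.pop();
    // A transition without an undo handler has nothing to revert.
    if (!last.undo) { return undoLast(); }
    last.undo(doc, function(err) {
      if (err) { return callback(err); }
      undoLast();
    });
  }());
}

var externalIndex = { 'doc-1': true }; // stands in for an external service

var history = [
  { to: 'validated' },
  {
    to: 'indexed',
    // Reverts the change this transition made to the external service.
    undo: function(doc, cb) { delete externalIndex[doc.id]; cb(); }
  },
  { to: 'error' }
];

var result = {};
rollbackTo('validated', history, { id: 'doc-1' }, function(err, doc) {
  result.err = err;
  result.remaining = history.map(function(t) { return t.to; });
});
```

After the rollback the document is back at “validated”, the entry in the stand-in external service is gone, and the pipeline could be played again from there.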

All JavaScript

The state transition handlers and the pre-conditions are all defined in JavaScript and are asynchronous, meaning that you can perform I/O inside them. The pipeline definition is also written in JavaScript.

Architecture

A Banzai deployment has four main components: the document store, the state store, the workers and the work queue.

The Document Store

The document store is where - you guessed it - the documents are stored. The document is retrieved when entering a state transition, passed into the state transition handler and saved when the handler is done. This way a state transition can be picked by any worker and the document is always persisted, surviving failures.

State Store

The state store holds the state of each document that is transitioning, or has transitioned, through a pipeline. There you can also find some additional metadata, like all the transitions that occurred and their start and end times, plus any metadata that the state transitions chose to save.

Workers

Workers are processes that listen for state transitions, invoke the state transition handler and decide the next state.

Work Queue

The Work Queue is an event queue that persists and distributes the transitions to be picked up by the workers.
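Here is a toy sketch of how those four components could fit together in a single process. Everything in it is a stand-in: plain objects play the document store and the state store, an array plays the work queue, and `workOnce` and the `shouted` state are hypothetical names. A real deployment would use CouchDB and Redis, as described below, and Banzai's own API will differ.

```javascript
// Hypothetical in-memory stand-ins, not Banzai's real components.
var docStore = { 'doc-1': { id: 'doc-1', text: 'hello' } };
var stateStore = { 'doc-1': { state: 'new', transitions: [] } };
var workQueue = [{ docId: 'doc-1', from: 'new', to: 'shouted' }];

// State entry handlers, keyed by the state being entered.
var handlers = {
  shouted: function(doc, done) {
    doc.text = doc.text.toUpperCase();
    done(null, doc);
  }
};

// One worker step: take a transition off the queue, load the document,
// run the handler, then persist the document and record the new state.
function workOnce(callback) {
  var job = workQueue.shift();
  if (!job) { return callback(null, false); }
  var doc = docStore[job.docId];
  var startedAt = Date.now();
  handlers[job.to](doc, function(err, changed) {
    var meta = stateStore[job.docId];
    meta.state = err ? 'error' : job.to;
    meta.transitions.push({
      from: job.from,
      to: meta.state,
      start: startedAt,
      end: Date.now()
    });
    if (!err) { docStore[job.docId] = changed; }
    callback(null, true);
  });
}

var worked;
workOnce(function(err, didWork) { worked = didWork; });
```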

Adapters

Doc Store and State Store

Currently the only supported database is CouchDB, but technically any document database could be supported. It should, by the way, store every version of the documents (as CouchDB does) if you want to be able to roll back to certain versions of the documents.

The module for supporting CouchDB is banzai-couchdb-store.

Queue

Currently we support Redis (any version >= 2.1.7) if you use the banzai-redis module, but any queueing system that allows the same semantics should work.

Show me the code!

Check out the project README.


(How I like to build classes in JavaScript)

When developing server-side JavaScript for Node.js, I generally tend to encapsulate classes inside CommonJS modules and expose the constructor function as the module itself.

As an incomplete example of how I used to do it, let’s build a module that exposes a rectangle class:

function Rectangle(x, y, width, height) {
  this.x = x;
  this.y = y;
  this.width = width;
  this.height = height;
}

Rectangle.prototype.area = function() {
  return this.width * this.height;
};

module.exports = Rectangle;

Let’s say you save this module under the name “rectangle.js”, in the current directory.

Then, to instantiate a rectangle you must do:

var Rectangle = require('./rectangle');
var rectangle = new Rectangle(1,2,3,4);
rectangle.area(); // -> 12

All is fine and dandy, right?

Nope. This way you can tamper with the rectangle object, changing properties and even overriding functions. I think this is not a major problem in itself, but it exposes a major design flaw, which I’ll cover later.
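To see what that tampering looks like, here is a short sketch that reuses the Rectangle constructor from above, inlined so the snippet runs on its own:

```javascript
function Rectangle(x, y, width, height) {
  this.x = x;
  this.y = y;
  this.width = width;
  this.height = height;
}

Rectangle.prototype.area = function() {
  return this.width * this.height;
};

var rectangle = new Rectangle(1, 2, 3, 4);
var original = rectangle.area();

// Any client can change the data behind the object's back...
rectangle.width = 100;
var afterDataChange = rectangle.area();

// ...or replace a method on this instance altogether.
rectangle.area = function() { return -1; };
var afterOverride = rectangle.area();
```

The three results are 12, 400 and -1: nothing stops a client from changing either the data or the behaviour.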

Now you want to add a private function. You have two main options: 1) add it as a function on the module scope or 2) add it as a function on the Rectangle.prototype object, but prefix its name with an underscore so everyone knows they shouldn’t call it.

Let’s say, for the purpose of the example, that you want to add a private function named “coalesce”, which you want to call at the end of the constructor.

1) Add it to the module scope

function coalesce() {
  var self = this;
  ['x', 'y', 'width', 'height'].forEach(function(prop) {
    if (!self[prop]) { self[prop] = 0; }
  });
}

function Rectangle(x, y, width, height) {
  this.x = x;
  this.y = y;
  this.width = width;
  this.height = height;
  coalesce.apply(this);
}

Rectangle.prototype.area = function() {
  return this.width * this.height;
};

module.exports = Rectangle;

Here we can see the constructor calling the “coalesce” function using Function.prototype.apply(), which sets the value of “this” for that call, so the coalesce function can use it as the object being constructed.
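For readers less familiar with apply(), here is a tiny standalone sketch of how it (and its sibling call()) choose the value of `this`. The `describe` function and the `box` object are made up for the illustration.

```javascript
// `describe` is not attached to any object; it relies on whatever
// `this` it is invoked with.
function describe(prefix) {
  return prefix + this.width + 'x' + this.height;
}

var box = { width: 3, height: 4 };

// apply() takes the `this` value followed by an array of arguments.
var viaApply = describe.apply(box, ['size: ']);

// call() takes the `this` value followed by the arguments one by one.
var viaCall = describe.call(box, 'size: ');
```

Both calls return 'size: 3x4'. Since coalesce takes no arguments, `coalesce.call(this)` would do the same job in the constructor above.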

2) Add it as a function on the Rectangle.prototype

function Rectangle(x, y, width, height) {
  this.x = x;
  this.y = y;
  this.width = width;
  this.height = height;
  this._coalesce();
}

Rectangle.prototype.area = function() {
  return this.width * this.height;
};

Rectangle.prototype._coalesce = function() {
  var self = this;
  ['x', 'y', 'width', 'height'].forEach(function(prop) {
    if (!self[prop]) { self[prop] = 0; }
  });
};

module.exports = Rectangle;

This way is simpler, but we’re exposing the coalesce function, which is ugly.

The problem

As I said earlier, this pattern exposes the methods and the data on the rectangle object.

The ultimate goal would be to expose the methods and encapsulate the data. How can we do that?

Here is a solution I like to use:

function Rectangle(x, y, width, height) {

  function area() {
    return width * height;
  }

  function coalesce() {
    if (! x) { x = 0; }
    if (! y) { y = 0; }
    if (! width) { width = 0; }
    if (! height) { height = 0; }
  }

  coalesce();
  return {
    area: area
  };
}

module.exports = Rectangle;

And a client of this module would look like:

var Rectangle = require('./rectangle');
var rectangle = Rectangle(undefined, undefined, 3, 4);

What have we done here?

The constructor simply returns an object that has the methods we want to expose. The data is encapsulated inside the constructor function, which also contains all the functions (private and public) that have privileged access to that data.

We’re also dropping the use of the “new” keyword in the class clients (omitting it could cause a lot of problems in the previous model).
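A quick way to convince yourself that the data really is hidden is to poke at the returned object. This sketch uses a trimmed-down copy of the constructor (without coalesce) so it runs on its own:

```javascript
function Rectangle(x, y, width, height) {
  function area() {
    return width * height;
  }

  // Only `area` is exposed; x, y, width and height stay in the closure.
  return {
    area: area
  };
}

var rectangle = Rectangle(1, 2, 3, 4);
var area = rectangle.area();

// The data is not a property of the returned object.
var leakedWidth = rectangle.width;

// Adding a `width` property from the outside does not reach the closure.
rectangle.width = 100;
var areaAfterTampering = rectangle.area();
```

Here `leakedWidth` is undefined and the area stays at 12 even after the client sets its own `width` property. A client can still replace `rectangle.area` itself, so what this pattern protects is the data, not the method slots.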

This pattern also allows for object methods (private or public) to call each other with no restraints, since we are not relying on the leaky this object.
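One place where the difference shows up is when a method gets detached from its object, for instance when it is passed around as a callback. The sketch below compares both styles; `ProtoRectangle`, `ClosureRectangle` and `describe` are names invented for this comparison.

```javascript
// Prototype style: methods depend on `this`.
function ProtoRectangle(width, height) {
  this.width = width;
  this.height = height;
}
ProtoRectangle.prototype.area = function() {
  return this.width * this.height;
};

// Closure style: methods refer to each other and to the data directly.
function ClosureRectangle(width, height) {
  function area() {
    return width * height;
  }
  function describe() {
    return 'area is ' + area(); // no `this` involved
  }
  return { area: area, describe: describe };
}

// Detach one method from each kind of object.
var protoArea = new ProtoRectangle(3, 4).area;
var closureDescribe = ClosureRectangle(3, 4).describe;

var protoResult;
try {
  // Without its object, `this` is no longer the rectangle: depending on
  // strict mode this either computes with missing values or throws.
  protoResult = protoArea();
} catch (err) {
  protoResult = 'threw ' + err.name;
}

var closureResult = closureDescribe();
```

The closure version still answers 'area is 12', while the detached prototype method no longer produces 12.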

What about inheritance?

A useful way of declaring that a class (or pseudo-class, if you will) inherits from another one is having the constructor prototype point to an object that it “inherits” behavior from. Node (like almost all JavaScript frameworks) has a convenience function for doing this: util.inherits().

For instance, say you want our Rectangle class (as in our first incarnation) inheriting from the Node EventEmitter class:

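var inherits = require('util').inherits
  , EventEmitter = require('events').EventEmitter;

function Rectangle(x, y, width, height) {
  // Let EventEmitter initialise its own state on this instance.
  EventEmitter.call(this);
  this.x = x;
  this.y = y;
  this.width = width;
  this.height = height;
}

// Make Rectangle.prototype inherit from EventEmitter.prototype.
inherits(Rectangle, EventEmitter);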

Rectangle.prototype.area = function() {
  return this.width * this.height;
};

module.exports = Rectangle;

Convenient, heh? (You must be careful to call inherits before setting the prototype properties, or else they will be nuked). How can we then implement inheritance if we’re not using the traditional JavaScript constructor functions?

Here is a way:

var EventEmitter = require('events').EventEmitter;

function Rectangle(x, y, width, height) {
  var that;

  function area() {
    return width * height;
  }

  function coalesce() {
    if (! x) { x = 0; }
    if (! y) { y = 0; }
    if (! width) { width = 0; }
    if (! height) { height = 0; }
  }

  coalesce();

  that = {
    area: area
  };

  that.__proto__ = EventEmitter.prototype;

  return that;
}

module.exports = Rectangle;

So, we’re using the __proto__ property, which in JavaScript refers to the object’s actual prototype. If you call any EventEmitter-specific method like on() or emit(), the runtime will first look for it on the rectangle object itself and, if it is not found there, will search up the prototype chain.

Mind you, the __proto__ property is not entirely portable across all JavaScript platforms and browsers, but there are ways around that.
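The lookup described above can be observed directly in Node. In this sketch the object literal stands in for the rectangle, and the 'resized' event name is made up for the example:

```javascript
var EventEmitter = require('events').EventEmitter;

var that = {
  area: function() { return 12; }
};
that.__proto__ = EventEmitter.prototype;

// `area` lives on the object itself...
var ownsArea = that.hasOwnProperty('area');

// ...while `on` is found further up the prototype chain.
var ownsOn = that.hasOwnProperty('on');
var canCallOn = typeof that.on === 'function';

// The inherited methods work as usual.
var received = [];
that.on('resized', function(width) { received.push(width); });
that.emit('resized', 10);
```

Here `ownsArea` is true and `ownsOn` is false, yet `on()` and `emit()` are callable and the listener receives the value 10.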
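One such way, sketched here as a possible alternative rather than the definitive answer, is to create the object with Object.create() so that the prototype is set up front and __proto__ is never assigned. The coalesce step is left out to keep the example short, and the 'measured' event name is invented for the illustration.

```javascript
var EventEmitter = require('events').EventEmitter;

function Rectangle(x, y, width, height) {
  function area() {
    return width * height;
  }

  // Create the object with EventEmitter.prototype already in its chain,
  // instead of assigning __proto__ afterwards.
  var that = Object.create(EventEmitter.prototype);

  // Let EventEmitter initialise its own state on the new object.
  EventEmitter.call(that);

  that.area = area;
  return that;
}

var rectangle = Rectangle(1, 2, 3, 4);

var events = [];
rectangle.on('measured', function(value) { events.push(value); });
rectangle.emit('measured', rectangle.area());

var isEmitter = rectangle instanceof EventEmitter;
```

The data is still private to the closure, the object still responds to on() and emit(), and `isEmitter` is true.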