Tuesday, September 14, 2010

First days of node.js and mongodb

I'm putting together a little web application to manage Delphi Method style collaborative priority setting and decision making. I'm using it as a chance to try out node.js, mongodb with a native mode driver, and express.js.

Altogether they make a sweet team. I'm loving the lack of the typical language impedance mismatch in this type of project, and the possibility of sharing code between client and server without adding the bulk of something like GWT or Cappuccino.

However, I'm not happy with the mongodb driver interface style for node.js. Consider the following variation on its simplest example, which checks if a record exists and adds it if it doesn't:


var sys = require('sys');
var mongodb = require('mongodb');
var Db = mongodb.Db, Server = mongodb.Server;

// host and port are assumed to be defined elsewhere
var db = new Db('my_db',
    new Server(host, port, {}),
    {native_parser: true});
var query = {'key': 'value'};
var record = {'key': 'value', 'data': 'stuff'};

db.open(function(err, db) {
    db.collection('test', function(err, collection) {
        collection.find(query, function(err, cursor) {
            cursor.toArray(function(err, items) {
                if (items.length > 0) {
                    db.close();
                    sys.puts("query already exists");
                } else {
                    collection.insert(record, function(err, docs) {
                        db.close();
                        sys.puts("record has been added");
                    });
                }
            });
        });
    });
});


This mongodb driver follows the node.js idiom for non-blocking IO: each call takes a function which will be called when the IO is complete. None of the nested functions above is executed immediately. Each is deferred until its mongo database interaction is done, so control goes straight back to the caller, which plays nicely in a cooperative multitasking environment.
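To make the deferral concrete, here is a minimal sketch that runs without a database. fakeFind is a made-up stand-in for a driver call, not part of the mongodb driver; it just uses setTimeout to hand its result back on a later turn of the event loop.

```javascript
// Records the order in which things happen.
var order = [];

// Hypothetical stand-in for a non-blocking driver call: it returns
// at once and invokes the callback later, from the event loop.
function fakeFind(query, callback) {
    setTimeout(function() {
        callback(null, [{'key': query.key}]);
    }, 0);
}

order.push('before find');
fakeFind({'key': 'value'}, function(err, items) {
    order.push('callback ran with ' + items.length + ' item(s)');
});
order.push('after find');

// At this point order holds only 'before find' and 'after find'.
// The callback entry is appended once the current code has finished.
```

The caller gets control back before the callback has had a chance to run, which is exactly why the results have to be handled inside the callback.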

But the simplistic code style presented above and in the sample code encourages code duplication. And frankly, naming anonymous functions is one of the first things to do if you want other folk to be able to read your JavaScript code. Let's pull those anonymous functions out and see if it's any clearer.


var database = new Db('my_db',
    new Server(host, port, {}),
    {native_parser: true});
var db;
var collection;
var query = {'key': 'value'};
var record = {'key': 'value', 'data': 'stuff'};

openMyDb(database);

function openMyDb(database) {
    database.open(openTheTestCollection);
}

function openTheTestCollection(err, database) {
    db = database;
    db.collection('test', findRecordsWithKeyAndValue);
}

function findRecordsWithKeyAndValue(err, coll) {
    collection = coll;
    collection.find(query, convertingTheCursorToAnArray);
}

function convertingTheCursorToAnArray(err, cursor) {
    cursor.toArray(insertKeyAndValueIfRequired);
}

function insertKeyAndValueIfRequired(err, items) {
    if (items.length > 0) {
        db.close();
        sys.puts("query already exists");
    } else {
        collection.insert(record, closeDb);
    }
}

function closeDb(err, docs) {
    db.close();
    sys.puts("record has been added");
}


Notice how there needs to be a db and a collection variable to share those things between the functions. But that means this is no longer safe in the face of parallel operation: two overlapping runs would overwrite each other's db and collection.

The nasty nested version was necessary because those nested contexts were creating new variables needed for safe parallel operation. I can't reuse the nested functions, but I can't safely use the non-nested functions.
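The difference can be shown without a database. The sketch below is only an illustration: fakeOpen and fakeFind are made-up stand-ins for driver calls, and a small hand-rolled queue plays the part of the event loop so that the interleaving is the same on every run.

```javascript
// Tiny stand-in for the event loop: callbacks are queued, then run in order.
var pending = [];
function defer(fn) { pending.push(fn); }
function runAll() { while (pending.length) { pending.shift()(); } }

// Hypothetical non-blocking calls (not the real driver API).
function fakeOpen(name, callback) {
    defer(function() { callback(null, name); });
}
function fakeFind(callback) {
    defer(function() { callback(null, []); });
}

// Version 1: named functions sharing a module-level variable.
var collection;
var sharedResults = [];
function sharedStart(name) { fakeOpen(name, sharedOpened); }
function sharedOpened(err, coll) {
    collection = coll;          // overwritten by whichever run gets here last
    fakeFind(sharedFound);
}
function sharedFound(err, items) { sharedResults.push(collection); }

// Version 2: nested functions, each run closes over its own coll.
var nestedResults = [];
function nestedStart(name) {
    fakeOpen(name, function(err, coll) {
        fakeFind(function(err, items) {
            nestedResults.push(coll);
        });
    });
}

// Two overlapping runs of each version.
sharedStart('users');
sharedStart('orders');
nestedStart('users');
nestedStart('orders');
runAll();

// sharedResults is ['orders', 'orders']: the first run lost its collection.
// nestedResults is ['users', 'orders']: each run kept its own.
```

Both shared runs end up looking at 'orders' because the second run reassigns collection before the first run's find completes.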

A solution might be something like a push-forward context alongside the function parameter:


// Proposed interface: the trailing context argument is not something
// the driver accepts today.
openMyDb(new Db('my_db',
        new Server(host, port, {}),
        {native_parser: true}),
    {'key': 'value'},
    {'key': 'value', 'data': 'stuff'});

function openMyDb(db, query, record) {
    db.open(openTheTestCollection,
        {'query': query, 'record': record});
}

function openTheTestCollection(err, db, context) {
    context.db = db;
    db.collection('test', findRecordsWithKeyAndValue, context);
}

function findRecordsWithKeyAndValue(err, collection, context) {
    context.collection = collection;
    collection.find(context.query, convertingTheCursorToAnArray, context);
}

function convertingTheCursorToAnArray(err, cursor, context) {
    cursor.toArray(insertKeyAndValueIfRequired, context);
}

function insertKeyAndValueIfRequired(err, items, context) {
    if (items.length > 0) {
        context.db.close();
        sys.puts("query already exists");
    } else {
        context.collection.insert(context.record, closeDb, context);
    }
}

function closeDb(err, docs, context) {
    context.db.close();
    sys.puts("record has been added");
}


This would give me a safe way to reuse these functions, and make it possible to build up some nice reusable chunks for things that are being duplicated at the moment (like opening a database and a given collection).

I can easily imagine the driver itself copying things into the context, like I've done above with db and collection, and that would slim things down even further.
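As a rough sketch of what that might look like, here are two made-up functions, openDb and openCollection, that mimic driver calls. They are not the real driver API. Each one accepts an optional context, copies its own result into it, and passes it along as an extra trailing callback argument. A hand-rolled queue again stands in for the event loop.

```javascript
// Tiny stand-in for the event loop, so the sketch runs without a database.
var pending = [];
function defer(fn) { pending.push(fn); }
function runAll() { while (pending.length) { pending.shift()(); } }

// Hypothetical driver-style calls. The context is optional, so callers
// that only pass a callback keep working and simply never see it.
function openDb(name, callback, context) {
    defer(function() {
        var db = {'name': name};
        if (context) { context.db = db; }       // the driver fills this in
        callback(null, db, context);
    });
}
function openCollection(db, name, callback, context) {
    defer(function() {
        var collection = {'name': name};
        if (context) { context.collection = collection; }
        callback(null, collection, context);
    });
}

// Reusable named callbacks: no shared variables, everything is in context.
var log = [];
function onDb(err, db, context) {
    openCollection(db, 'test', onCollection, context);
}
function onCollection(err, collection, context) {
    log.push(context.db.name + '/' + context.collection.name +
        ' for ' + context.record.key);
}

// Two overlapping runs, each with its own context object.
openDb('db_a', onDb, {'record': {'key': 'first'}});
openDb('db_b', onDb, {'record': {'key': 'second'}});
runAll();

// log is ['db_a/test for first', 'db_b/test for second']
```

The callbacks no longer need to copy db and collection into the context themselves, and each run only ever touches the context it was started with.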

Perhaps I should fork the driver and try out those changes myself. As I've presented them, they wouldn't even break any existing code.
