Thursday, March 19, 2015

Viewing JavaScript console output on mobile devices

The Problem

Most mobile devices come with little if any debugging support. Debugging is usually done by connecting the device to a host computer (e.g., iPhone to a Mac) and using tools on the host computer. That's great if you have an iPhone and a Mac. But what if you Just Want to see what the JavaScript console is putting out, since you went to the trouble to put so much helpful debug information there?

The Solution

Give your console some soul

...Is barely even worthy of a blog post, or 'blst' for short (as the kids here at SXSW are saying this week). Essentially, the secret is to hook the console's log function, then use it for our own purposes. At that point we can choose to forward the log statements on to the console, or 'swallow' them and only let our code have access.

The magic happens with just a few lines of code:


// console.log is just a function... store a reference
var oldf = console.log;

// redefine it and insert our own code
console.log = function () {

    // use the console output for our own purposes 
    someFunctionToShowConsole(arguments[0]);

    // also send it to the console, if we want
    oldf.apply(console, arguments);
}
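To see the hook in action outside a browser, here's a node-runnable sketch where someFunctionToShowConsole just collects messages in an array (a hypothetical stand-in for the div writer we'll build below):

```javascript
// Stand-in for the div writer: just collect messages in an array
var captured = [];
function someFunctionToShowConsole(msg) { captured.push(msg); }

// Hook console.log, keeping a reference to the original
var oldf = console.log;
console.log = function () {
    someFunctionToShowConsole(arguments[0]);
    oldf.apply(console, arguments);
};

console.log('debug: widget loaded');

// Restore the original when done
console.log = oldf;
```

Swap the array push for a DOM write and you have the faux console.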

Everything else is just window dressing. But, in order to give us a nice little debug window that doesn't take up too much of the page and lets us scroll etc., etc., we'll wrap it all up in a div docked to the bottom of the page, and show timestamps for log entries, and add a way to clear the console if the log gets too long.

When we're done, it should look something like this:


The 'real' (Chrome) console is on the right, our faux console is on the left.

The full code is below, and any updates will be posted to https://gist.github.com/jasonbrice/8131e38c14695afb4127


// set an identifier for the console div
var consoleId = 'debug_console';

// initialize the debug console, dock to bottom of page
var debugConsole = function () {

    var body = document.getElementsByTagName('body')[0];

    // since this function also gets called on clear(),
    // we'll remove the div if it exists so we can re-add it
    if (document.getElementById(consoleId))
        body.removeChild(document.getElementById(consoleId));

    // create new div and set whatever attributes we want
    var div = document.createElement('div');
    div.id = consoleId;

    // dock to bottom of page. styles are merely a matter of preference: move, color, change to suit.
    div.setAttribute('style',
        'position: fixed; bottom: 0px; left: 0px; width: 100%; height: 200px; color: #fff; background: #666; overflow: scroll;'
    );

    // make our new div part of the DOM
    body.appendChild(div);

    // add a link that calls this function to clear the console
    var href = document.createElement('a');
    href.href = 'javascript:debugConsole();';
    href.setAttribute('style', 'text-decoration: none; color: white;');
    href.innerHTML = 'Clear';

    div.appendChild(href);

    // hook the console output function
    var oldf = console.log;
    console.log = function () {

        // use the console output for our own purposes
        div.innerHTML +=
            '<br/><span style="font-size: 12px; font-family: monospace;">' +
            timestamp() + '.$ ' + arguments[0] + '</span>';

        // also send it to the console
        oldf.apply(console, arguments);
    }
}

// timestamp adapted from https://gist.github.com/hurjas/2660489
var timestamp = function timeStamp() {

    // Create a date object with the current time
    var now = new Date();

    // Create an array with the current year, month and day
    var date = [now.getFullYear(), (now.getMonth() < 9 ? "0" : "") +
        (now.getMonth() + 1), now.getDate()
    ];

    // Create an array with the current hour, minute and second
    var time = [now.getHours(), now.getMinutes(), now.getSeconds()];

    // Determine AM or PM suffix based on the hour
    var suffix = (time[0] < 12) ? "AM" : "PM";

    // Convert hour from military time
    time[0] = (time[0] < 12) ? time[0] : time[0] - 12;

    // If hour is 0, set it to 12
    time[0] = time[0] || 12;

    // If seconds and minutes are less than 10, add a zero
    for (var i = 1; i < 3; i++) {
        if (time[i] < 10) {
            time[i] = "0" + time[i];
        }
    }

    // Return the formatted string
    return date.join(".") + " " + time.join(":") + " " + suffix;
}

// initialize the console once the rest of the
// document has finished loading
window.onload = debugConsole;

Conclusion

Even in 2015, there is no nice and convenient way to debug web pages on mobile devices. This solution would be even cooler if we could look at raw code and set breakpoints. But for now we can at least know what the console is spitting out, which, let's face it, is probably how the majority of NodeJS debugging is done.

Happy debugging, and if you like it, steal it!

Sunday, February 22, 2015

Blazing fast JSON query results with sockets

TL;DR

Clone this repository and change the database connection string to kickstart a Socket based CRUD app: https://github.com/jasonbrice/stealthis_sockets

Um Wut


For today's funfest, we'll be serving up some relational data over the web via websockets, and using a veritable buzzword-a-palooza of hawt toolz like: NodeJS (JavaScript engine), socket.io (websockets), Postgresql (database), and BookshelfJS/Knex (ORM layer).

Why would we do such a thing?

Let's imagine that you are a developer responsible for creating a web-based API that handles basic CRUD operations for a relational data store. Further suppose that the 'web based' part of the requirement means that your query language and results format are both JSON.

Let's make one final assumption, that you think writing endless layers of DAO objects and repeatedly testing for and debugging column/property mismatches feels a bit like going to the dentist with a sock full of fire ants.

Still Reading?


Let's get started. You're going to need NodeJS. Don't have NodeJS? This handy guide will help you.

First, pick an empty directory somewhere and create a text file called package.json. Make the contents of package.json the following:

{
  "name": "MahApp",
  "version": "0.0.0",
  "private": true,
  "scripts": {
    "start": "node app.js"
  },
  "dependencies": {  }
}

Next create a text file called app.js. The contents of app.js are these:


console.log('steal this tech!');


To verify that you installed NodeJS correctly, navigate to the project root directory with a command-line tool, and run the command:

npm start


You should see output similar to:

$ npm start

> MahApp@0.0.0 start c:\source\code\blog
> node app.js

steal this tech!


Congrats, you're almost ready to apply for the CTO gig at Google. Just a few more bits to go.

Next we're going to install some dependencies. First, let's make our new app a proper http/ws server.


npm install --save express express.io && npm list


Hopefully when this is done you see a dependency tree with express and express.io at the top levels.

BIG DISCLAIMER: depending on your environment (especially a Windows machine without various developer tools) this command might create some fireworks. It's beyond the scope of this post to set up all the dependencies you need for node-gyp or a C++ toolchain, but this is an easily solved problem with a bit of Google-Fu.

Once you get express and express.io installed, you're ready to start some servitude.

First, let's beef up our app.js to listen for http and ws connections. app.js should now look like:


app = require('express.io')()

app.http().io()

app.get('/', function(req, res) {
    res.sendfile(__dirname + '/index.html')
})

app.io.route('ready', function(req) {
    req.io.emit('status', 'A full-on websocket is at your service (' + new Date().toString() + ')');
})

var port = process.env.PORT || 3101;

console.log("steal this tech is listening on port " + port);
app.listen(port);


We've done a couple of things here: initialized a listener on port 3101, and told the http server to serve a file called index.html when someone makes an http request at the root. Let's create an HTML file that will then make a socket connection to the server. This piece is more or less stolen outright from the express.io examples


<html>
<script src="/socket.io/socket.io.js"></script>
<script>
  io = io.connect()

  // Emit ready event.
  io.emit('ready')

  // Listen for the status event.
  io.on('status', function(data) {
    document.body.innerHTML = data;
  })
</script>
<body></body>
</html>

Now, start the app again with 'npm start' and navigate to http://localhost:3101 with a modern browser. Preferably Chrome. Do you see a message? Good! Here's what happened:

  1. The app started up and began listening for both http and ws protocol requests on port 3101.
  2. By navigating to the root with a browser, you sent an http request, to which express replied by sending you the index.html
  3. Then, the index.html page loaded a javascript socket library, created a socket connection to express.io, and fired an event ('ready') to let the server side know it was ready
  4. The server side heard the event and fired its own event ('status') with a message payload
  5. The client (browser) side received the server's status event, and made the contents of the message payload visible. 

Cool!

In case you didn't already know this amazing tidbit, a websocket is a live connection between your browser and a server. With "stateless" http, a full request and response (along with headers, cookies, and other attempts to mimic statefulness) are sent for every exchange, even if it's just a teeny ajax call to update a few bytes of data. With websockets, transferring a few bytes of data only costs you a few bytes of data (nitpickers - I know this is a bit of an oversimplification. It's a blog, not a PhD thesis).
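As a rough, back-of-the-envelope illustration of the difference (the header values below are made up, and websocket framing overhead varies with payload size and masking):

```javascript
// A typical-looking set of HTTP request headers that get resent on every call
// (values are illustrative, not from a real capture)
var httpHeaders =
    'GET /api/update HTTP/1.1\r\n' +
    'Host: example.com\r\n' +
    'User-Agent: Mozilla/5.0\r\n' +
    'Cookie: session=abc123\r\n' +
    'Accept: application/json\r\n\r\n';

var payload = '{"x":1}';

// Over HTTP, every tiny update pays the header tax again
var httpCost = httpHeaders.length + payload.length;

// Over an open websocket, a small frame adds only a few bytes of framing
var wsFramingBytes = 6; // ballpark for a small masked client-to-server frame
var wsCost = wsFramingBytes + payload.length;

console.log('HTTP cost: ~' + httpCost + ' bytes, websocket cost: ~' + wsCost + ' bytes');
```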

Let's make this a bit more useful than just transferring cute messages between a client and a server.

Part II: You Know I'm All About the (Data)Base


For the next part of the show, we're going to create a database with a few tables. For the sake of following along, you'll either need to download/install/configure Postgresql and set up a database, or, do what I'm going to do and create a free account on elephantsql. Either way, I'm going to gloss over the initial setup of the database part other than to say by the end of it you should have a new database/schema capable of being connected to with a Postgresql connection string.

Also, if you don't have one already, you might want a graphical tool for developing and visualizing the database objects we'll be creating. Postgresql comes with a pretty good one (pgAdmin) which you can download as a standalone package if you don't want the entire Postgres footprint on your machine.

Ok, so following the ElephantSQL setup guide, I've got me a free hosted database I can access from my local machine.

Does this admin client make me look phat?

We're going to set up a very, very simple database with three tables: a person, a login, and an address. The point is not to show off data normalization techniques; rather, let's do something that could be the basis for an authentication system.

Putting the ERD in NERD


The DDL required to generate this structure can be found here.

Here comes the CRUD

In the bad old days, Conestoga Wagons hauled wooden barrels of SQL around that were carefully unpacked and hand-wired directly to CGI scripts, and later, VB.NET asp pages. These dark days eventually gave way to ORM frameworks like Hibernate, which let you replace painstaking hard-coded SQL with painstaking hard-coded XML. And there was much CASCADE confusion, and many Lazy Loading Exceptions were thrown, and lo there came the OpenSessionInView filter, and Spring eventually came along and washed away many of our tears.

These days, even with Microsoft's most bleeding edge version of Entity Framework, you still gotta write a metric boatload 'o code to do a full set of Create Read Update Delete operations from a web page to the database and back.

One day that will seem as dumb as leeches and phlogiston theory, but for now let's at least console ourselves with these shining beacons of sanity: BookshelfJS and Knex.

Back at our command line, let's install them:


npm install --save knex bookshelf pg

Knex is the "low level" library that will be generating the SQL statements, and Bookshelf is the higher-level ORM library that will let us identify the relationships between objects and tables with far, far less coding than you would need with Spring / Hibernate / Entity Framework. pg is the particular dialect we've chosen, but you could swap that out for mysql / sqlite / etc.

BookshelfJS Models & Collections

Let's get to cruddin'! At the root level of the app, create two new directories: models and collections. We'll put our BookshelfJS models in - you guessed it - models. We'll refer to arrays of them with collections.

For our first model, let's create a person. Create a new file called models/person.js. It should look like this:


/**
 * This is the model representation of the person table. It represents a single person.
 */

var Bookshelf = require('bookshelf').DB;

exports.model = Bookshelf.Model.extend({
  tableName: 'stealthis.person',
  idAttribute: 'id'
});

That's it. That's our model "glue" code. Now a collection. Create a new file called collections/people.js. It should look like this:


/*
 * This represents a collection of all entries in the person table. We will use this for our list method.
 */

var Bookshelf = require('bookshelf').DB;

var Person = require("../models/person").model;

exports.collection = Bookshelf.Collection.extend({
  model: Person,
  tableName: "stealthis.person"
});

Now we're ready to connect Bookshelf to Knex, and then to a realtime socket.io route. The app.js should now look like this:



app = require('express.io')()

app.http().io()

app.get('/', function(req, res) {
  res.sendfile(__dirname + '/index.html')
})

var port = process.env.PORT || 3101;

app.io.route('ready', function(req){
  req.io.emit('status', 'A full-on websocket is at your service (' + new Date().toString() + ')');
})

// New code below! 

// Set up a connection to our database with a new Knex instance
var pg = require('knex')({
  client: 'pg',  debug: true,  connection: "postgres://****@babar.elephantsql.com:5432/mfifoqjd"  ,
  pool: {    min: 2,    max: 5  }
});

// Bring in Bookshelf!
var Bookshelf = require('bookshelf');
Bookshelf.DB = Bookshelf.initialize( pg );

// Make the app aware of our model
var People = require('./collections/people').collection;
var Person = require('./models/person').model;

// Define some realtime routes
app.io.route('person', {
    create: function(req) {
        // create a person
        Person.forge(req.data).save().then(function(person){
            req.io.emit('response', person)
        });
    },
    list: function(req) {
        // list all people
        new People().fetch({}).then(function (collection) {
            req.io.emit('response', collection)
        });
    }
});

// End of new code!

console.log("steal this tech is listening on port " + port);
app.listen(port);

Ok. That was a lot. We created a new Knex instance, then initialized Bookshelf with it. Then we told socket.io how to do realtime routing with create and list Person methods, inside which we made Bookshelf calls that do the lifting and send SQL via Knex.
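The 'person' namespace plus create/list method names is what lets the client address individual handlers with events like person:create. Conceptually (this is a sketch of the idea, not express.io's actual implementation), the routing boils down to splitting the event name and looking up a handler in a table shaped like the one we passed to app.io.route:

```javascript
// Hypothetical route table in the shape app.io.route() receives
var routes = {
    person: {
        create: function (req) { return 'created ' + req.data.firstname; },
        list: function (req) { return 'listing people'; }
    }
};

// Dispatch 'namespace:method' events to the matching handler
function dispatch(eventName, req) {
    var parts = eventName.split(':'); // e.g. ['person', 'create']
    return routes[parts[0]][parts[1]](req);
}

console.log(dispatch('person:create', { data: { firstname: 'Ray' } }));
```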

Next comes mind === 'blown'


Start the app (npm start). Point a browser at your app (http://localhost:3101). Open a JavaScript console. Type the following in the console:


io.emit('person:create', {firstname: 'Ray', lastname: 'Charles'});

You should see something like:



Now. Open your Postgres SQL client and type: SELECT * FROM stealthis.person

You should see:

Are you feeling awesome? You should be. You just sent a JSON object from your freakin' console to a real actual hardcore database with like a few lines of JavaScript.

But we're not done getting more Awesomer. Not yet.

Part III: Rocking the Proverbial Casbah

At this point you could just download the finished code, tweak it, and run it yourself. But for the curious, here are a few final problems to solve.

1) We need to finish wiring up relationships between objects. Let's add the Login and Address classes, and make sure they all have the right CRUD methods.
2) Our app.js is starting to get crowded. If we add lots and lots of realtime routes, it'll be a big mess to understand and debug. We don't want a single misplaced semicolon to cause our app not to start, so let's break routes out into their own files.
3) Let's add a hashing library so that we're not storing passwords in cleartext.
4) We need to update index.html to be able to give us status updates.

So. Create a directory called realtime_routes. In realtime_routes, create a file called person.js. That file should look like this:


var People = require('../collections/people').collection;
var Person = require('../models/person').model;


module.exports = {

    list: function list(req) {
        new People().fetch({}).then(function (collection) {
            req.io.emit('response', {status: 200, results: collection})
        });
    },

    select: function select(req) {
        Person.forge({id: req.data.id}).fetch({}).then(function (person) {
            req.io.emit('response', {status: 200, results: person})
        });
    },

    create: function(req) {
        Person.forge(req.data).save().then(function(person){
            req.io.emit('response', {status: 200, results: person})
        });
    },

    update: function(req) {
        if(req.data.id){
            Person.forge(req.data).save().then(function(person){
                req.io.emit('response', {status: 200, results: person})
            });
        } else {
            req.io.emit('response', {status: 400, results: {error: "Could not update person", reason: "id was not specified"}});
        }
    }
};

And we can reduce the new code in app.js to:


// New code below! 

// Set up a connection to our database with a new Knex instance
var pg = require('knex')({
  client: 'pg',  debug: true,  connection: "postgres://****@babar.elephantsql.com:5432/mfifoqjd"  ,
  pool: {    min: 2,    max: 10  }
});

// Bring in Bookshelf!
var Bookshelf = require('bookshelf');
Bookshelf.DB = Bookshelf.initialize( pg );
// Reference the person route in a different file...
app.io.route('person', require('./realtime_routes/person'));

// End of new code!
// Le snazzee, n'est-ce pas?

Ok, a few more things for this demo then it's all yours. We should:

  1. Add realtime_routes for login and address, as well as require directives for them in the main app.
  2. Add an authenticate method for our Login model

Add Realtime Routes

Create realtime routes for address and login. You can copy/paste and find/replace using the working person.js route.

BUT, for login, let's delete all but the create method. And, let's add an authenticate method:


    authenticate: function(req) {
        if(req.data.email && req.data.password) {
            req.data.password = sha512(req.data.password).toString('hex');
            Login.forge({email: req.data.email, password: req.data.password}).fetch({})
                .then(function (login) {

                    login = login.toJSON();

                    if(login.password) delete login.password;

                    Person.forge({id: login.personid}).fetch({}).then(function (person) {
                        req.io.emit('response', {person: person, login: login});
                    })
                }).catch(function(error){
                    console.log(error);
                    req.io.emit('response',
                        {status: 500, error:
                            {message: "Could not complete request",
                             reason: "Credentials not found or database is down"}});
                });
        }
        else {
            req.io.emit('response',
                {status: 400, error:
                    {message: "Malformed request", reason: "Missing email or password"}});
        }
    },

We're also going to modify the login create method to scramble the password using the sha512 library. (DISCLAIMER - there are many other things we need to consider to make this app secure, but let's at least hash the password to keep our database safe from kids in treehouses.)

Now login create looks like this:


create: function(req) {

        if(req.data.password){
            req.data.password = sha512(req.data.password).toString('hex');
        }

        Login.forge(req.data).save().then(function(login){
            login = login.toJSON();
            if(login.password){ delete login.password; }
            req.io.emit('response', {status: 200, login: login})
        }).catch(function(error){
            console.log(JSON.stringify(error));
            req.io.emit('response', 
                {status: 500, error: 
                   {message: "Could not create login", reason: error.detail}});
        });
    }

You'll need to use npm to install sha512, and add a line at the top of login.js to require it.

Test that the login route is working by creating a login:

io.emit('login:create', {email: 'ray@atlanticrecords.com', password: 'Ray', personid: 3});

If it worked, you'll get a login object and a status object back. Otherwise, you'll get an error with a (hopefully) helpful message. (Make sure to swap out the personid with the correct primary key of the person record.)

Then, try and authenticate using your newly created login:

io.emit('login:authenticate', {email: 'ray@atlanticrecords.com', password: 'Ray'});


Next Steps

If you want to take this app further, here are some things you could do:


  1. Wire up the rest of the object model, and add to it. The completed project has a relationship between address and person but we didn't cover it here. 
  2. Worry about validation and security. You don't want just anybody making updates to your database.
  3. Create unit tests to automate the task of validating your mappings
  4. Move the entity relationship handling to Bookshelf. Right now we're giving the route an outsized role (e.g., the method to get a person from a login is hand-coded... Bookshelf can handle that for us! Maybe a follow-on blog will tackle this.)

And don't forget - if you like it, STEAL IT! https://github.com/jasonbrice/stealthis_sockets


Monday, September 9, 2013

A Very Simple Demonstration of Node.js/Express and SQLite

Obligatory Boring Preamble

At the risk of disappointing all zero of my loyal followers, today we'll take a break from the C#.NET ecosystem and try something totally different.

Server side JavaScript got pretty popular about two years ago (about 2011 - in case you're reading this from the future), and it has matured to the point where new-"ish" tech adopters like me are willing to give it a try - actual tools exist, people are using it for real projects in production, and it's no longer attracting as much attention from "latest thing" fad seekers. (Don't get me wrong. I love new things - but I have learned to love new things that are *useful* for doing *work*.)

So for today's installment, let's create an app that brings these two great tastes together, and as usual, share the goods for you to (hopefully) steal.

The Technology Stack

Like most of the stuff I have the luxury of posting on a public blog, this is going to be pretty basic.

For development, we're going to use:

  • Node.js
  • Eclipse (free multi-language IDE available here)
  • Nodeclipse (free Eclipse plugin for developing Node apps, available here)
  • Node Package Manager (NPM) (free package manager for Node.js modules available here)


And once we've got NPM installed and working, we're going to use it to install a few packages:

  • Express.js ("a minimal and flexible node.js web application framework")
  • Jade (a templating engine for generating our HTML user interface)
  • node-sqlite3 (Asynchronous, non-blocking SQLite3 bindings for Node.js)


The Finished Product

We'll end up with a small Node.js app that allows you to create and view SQLite3 databases, and modify some of their elements (create new tables and columns). The app will also demonstrate a few elements of Express.js - the routing of get and post requests, as well as some basic features of templating with Jade.

If you want to play with it yourself, you can clone the project files from my Git repository after following along with the setup we'll do.

Setup

First, download and install Node.js from http://nodejs.org/.

Next, install Eclipse (and any pre-requisites like Java) from http://eclipse.org/recommenders/download/.

Once Eclipse is installed, install the Nodeclipse plugin:

  1. Go to the Eclipse's Help menu and choose Install New Software
  2. In the "Work With" field, paste this URL: http://www.nodeclipse.org/updates/
  3. Click the 'Add' button


Building the dev environment


When you install Node.js, it puts an icon in the main Start menu, as well as a program group.

This one will let you run Node.js server-side JavaScript in a console and see the output directly.


So, typing "Hello World" at the prompt will echo "Hello World". If you've ever used Firebug for trying out JavaScript, this does more or less the same thing.

What you'll probably end up using more often is the node command prompt:


This lets you start/stop/restart Node.js applications as well as use the Node Package Manager (hereafter referred to as npm). For starters, let's just see what versions of everything we've got:


So far so good! 

Next let's create a new Node project in Eclipse, so that we can start installing modules and get a skeleton site up.

In Eclipse, choose New > Node > Node Express Project > Next

Give your project a name and click Finish. Look at the output in Eclipse's console window:


Note that the Nodeclipse plugin has already done some work for us: it used npm to install 2 local node modules: Express (the web framework we'll be using to route requests and responses), and Jade (the templating engine we'll use to generate HTML output to the browser).

(Note, if your output looks garbled you might have Eclipse's text file encoding set to the wrong character set. Set it to UTF-8 to get the output shown above. See here for more).

You could have done this yourself by opening a Node.js command prompt, navigating to the root of your project directory, and typing: 

npm install express
npm install jade

Ok. It's time to do our first "smoke test" (fire it up, see if it smokes). While still in the Node.js command prompt, type:

node app.js

Open a browser, and open the URL http://localhost:3000

You should see the following:


And if you look back at your console output, you'll see some activity:



Looking at the project structure in Eclipse, we can see that (for now), app.js is where the action is. It's the entry point into our application, as well as containing the code that sets up the environment and starts the http listener. 

We also see that the Nodeclipse plugin generated a few other files and directories: 
  • views/layout.jade - the basic page template that other jade views extend
  • views/index.jade - the default landing page for our app
  • routes/index.js - the code that tells Node.js/Express to render the index page
  • Stylesheets, etc.

Adding a simple page


First, create a new express route. In your Eclipse project:
  1. Right click on routes
  2. Choose New > Node Source File
  3. Call it 'test.js' and click Finish
  4. Copy and paste the contents of index.js into test.js
  5. Replace exports.index with exports.get
  6. Replace res.render('index', { title: 'Express' }); with res.render('test', { title: 'Test Page' });

Next, create a new jade template:
  1. Right click on views
  2. Choose New > General > File
  3. Call it 'test.jade' and click Finish
  4. Copy and paste the contents of index.jade into test.jade
  5. Replace p Welcome to #{title} with   p This is a #{title}

Next, register the route as a module dependency in app.js. 
  1. Open app.js
  2. After the line var path = require('path'); add var test = require('./routes/test');
  3. After the line app.get('/users', user.list); add app.get('/test', test.get);
Try it out - stop the Node.js app by typing <CTRL> C, then run node app.js again. This time, navigate to http://localhost:3000/test. If it all went Ok, you should see:




A few notes about what we did. For simplicity, I used the word 'test' for everything - the name of the route, view, variable, etc. But you don't have to! Node.js/Express doesn't use "convention over configuration". The things to know are:
  • Whatever you name the view file needs to match the call to render it. So you could have a file called wombat.jade and call it from test.js.get() with
      res.render('wombat', { title: 'Test Page' });
  • You can call the variable in app.js anything you want, as long as it's pointing to the right file (and calls the same methods). So
      var aardvark = require('./routes/test'); would work, as long as you called the aardvark.get() method. 
  • You can call the method in test.js anything you want, as long as you register it in app.js. So 
      exports.jabiru = function(){...};
    would work as long as you changed the line in app.js to
      app.get('/test', test.jabiru); 
  • You can map the URL to anything you want. So    app.get('/Mah/RESTFul/Test', test.get); would work. This is pretty powerful (well, if you grew up with much more awful oldschool ways to map URLs to code).

    You can also use wildcards, e.g.,
      app.get('/Mah/*/Test', test.get);
    or placeholders for variables, so that 
      app.get('/products/:productName/details', product.details); would pass whatever is in the section of the URL between 'products/' and '/details' as a variable for the product.details method to use.
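For instance, a placeholder handler might read the variable off req.params like this (a sketch with mocked req/res objects so it runs standalone; product.details is hypothetical):

```javascript
// Hypothetical product.details handler for '/products/:productName/details'
function details(req, res) {
    // Express fills req.params.productName from the matching URL segment
    res.render('product_details', { title: req.params.productName });
}

// Simulate the request Express would build for GET /products/wombat/details
var rendered;
details(
    { params: { productName: 'wombat' } },
    { render: function (view, locals) { rendered = locals.title; } }
);
console.log(rendered); // wombat
```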

Let's try one more tweak to our test view/route before we move on to the good stuff - let's add a post method to accept data from the browser.
  1. Add the following method to test.js:

    exports.post = function(req, res){
     
        var name = req.body.name;
     
        console.log("Received POST data: " + name);
     
        res.render('test', { title: 'Test Page designed *exclusively* for ' 
            + name });
     
    };
    
    
  2. Map the method in app.js. Add:

    app.post('/test', test.post);
  3. Extend the jade template. The complete page should read:
    
    extends layout
    
    block content
      h1= title
      p This is a #{title}
      
      div
        form(method='post')
          p Enter your name:
          input(type='text',name='name',id='name')        
          input(type='submit',name='submit',value='Submit')
    
Note that jade is indent-sensitive! Indented elements are considered to be "children" of the elements before them. So the jade file above will result in the following HTML:

<!DOCTYPE html>
<html>
  <head>
    <title>Test Page</title><link rel="stylesheet" href="/stylesheets/style.css">
    </head>
  <body>
    <h1>Test Page</h1>
    <p>This is a Test Page</p>
    <div>
      <form method="post">
        <p>Enter your name:</p>
        <input type="text" name="name" id="name">
        <input type="submit" name="submit" value="Submit">
      </form>
    </div>
  </body>
</html>

Ok. Restart node (<CTRL> C, then node app.js from the command prompt) and refresh the page. Enter your name and click Submit - but first I should warn you. Prepare to be amazed!



Alright. Now for the good stuff. Let's get sqlite cooking!

Adding SQLite to the mix with node-sqlite3

If you don't have it already, download and install SQLite, and Python. Now we need a wrapper - i.e., an interface between the database and Node.js. Luckily there's node-sqlite3.

Let's use npm to install node-sqlite3:

  1. Open the Node.js command prompt from Start > All Programs > Node.js > Node.js command prompt.
  2. Navigate to your project directory (probably something like c:\users\[username]\workspace\[project]).
  3. Type the following command:

npm install sqlite3

If you would like to run the sqlite3 unit tests provided (always a good idea), you'll need to install mocha as well. (IMPORTANT NOTE! You must first navigate to the Sqlite3 directory inside your project - it should be in c:\users\[username]\workspace\[project]\node_modules\sqlite3 - otherwise the tests won't execute...).

npm install mocha
npm test

If everything went well you should see output similar to the following:




Let's create a SQLite database and see if we can spit some data out to the browser.

Create a new database

Open a plain old Windows command prompt and navigate to the root of your project directory. We're going to create a new database called test.db, add a table called 'foo', and insert a few rows of dummy data for testing purposes. Like so:


C:\Users\bricej\workspace\nodetest>sqlite3 test.db
SQLite version 3.7.17 2013-05-20 00:56:22
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> create table foo(id integer primary key asc, some_text text);
sqlite> insert into foo(some_text) values('howdy');
sqlite> insert into foo(some_text) values('there');
sqlite> insert into foo(some_text) values('folks!');
sqlite>

(I will spare us all the rant about always, ALWAYS having a surrogate key as a table's primary key, other than to say you should ALWAYS ALWAYS use a surrogate key. The fact that there is even a debate about this makes me sad. If you ever use a 'natural' key, you are signing up for some serious and painful data surgery at some point, buddy!)

Ok, now we're going to rinse and repeat the steps we used to create our fabulous test post page above.

First, create a route, called table_view.js:

/**
 * Table view
 */
var sqlite3 = require('sqlite3').verbose();
var db;

exports.get = function(req, res){
    console.log("Serving " + __filename);

    var db_name = req.params.db_resource;
    var table = req.params.table;

    var dbpath = db_name;
    db = new sqlite3.Database(dbpath, function(err){
        if (err) res.render('index', {title: 'Error: ' + err});
    });

    db.all("select * from " + table, function(err, rows) {

        // If the table doesn't exist (or anything else goes wrong), rows will
        // be undefined - bail out here instead of crashing on rows.length.
        if (err) {
            res.render('index', {title: 'Error: ' + err});
            db.close();
            return;
        }

        var message = rows.length > 0 ? "Viewing "
             + db_name + '/' + table
             : "No data found in table '" + table + "' in " + db_name;

        res.render('table_view', {message: message, rows: rows});
        db.close();
    });
};
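One caveat before moving on: in the route above, the table name from the URL is concatenated straight into the SQL string. That's fine for a local demo, but it's a SQL injection hole (input sanitization shows up again in the To Do list at the end). Even a crude whitelist check on the parameter would help; a minimal sketch:

```javascript
// Accept only plain identifiers (letters, digits, underscore) as table names,
// so nothing else from the URL can ever reach the SQL string.
function isSafeIdentifier(name) {
    return /^[A-Za-z_][A-Za-z0-9_]*$/.test(name);
}

console.log(isSafeIdentifier('foo'));                   // true
console.log(isSafeIdentifier('foo; drop table foo;'));  // false
```

In table_view.js you would check isSafeIdentifier(table) before calling db.all, and render an error page otherwise.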

Next, create a jade view called table_view.jade:

extends layout

block content
  h1= message

  h3 Data
  table
    - for (var r in rows)
      tr
        td= rows[r].id
        td= rows[r].some_text

  br

Finally, add a reference and a mapping to the app.js
var table_view = require('./routes/table_view');

and
app.get('/database/view/:db_resource/:table', table_view.get);

Now, if you restart the app and go to http://localhost:3000/database/view/test.db/foo you will see:



As in the previous example, the JavaScript file does most of the "heavy lifting" of querying the database and making the output available to the view. The jade view formats the output using a (hopefully) easy-to-understand templating syntax. The app (and the Express framework) handles the routing of the request, so that we can create nice, RESTful URLs to our code.

You can see how easy it is to use these simple components to feed data to a browser - and can imagine extending the form example above to build more sophisticated data querying capabilities.

A few more additions and tweaks

Let's expand what we've got so far to meet the following requirements:

  • Add the ability to create a new database on the fly without having to use the command line
  • Add the ability to list existing databases
  • Add the ability to add tables to any database in the list
  • Add the ability to add columns to any table
While we're at it, let's refactor our code so that:
  • The database specific stuff is in its own subdirectory, so that the root doesn't look so cluttered.
  • Verbose console messages are only printed while in "debug" mode.
When we're done, our file tree should look like this:

Project Root
|   app.js
|   package.json
|   README.md
|   
|       
|---databases
|       test.db
|       
|---node_modules
[Omitted for space]
|                   
|---public
|   |---stylesheets
|           style.css
|           
|---routes
|   |   echo.js
|   |   index.js
|   |   user.js
|   |   
|   |---database
|           db_create.js
|           db_edit.js
|           db_list.js
|           table_edit.js
|           table_view.js
|           
|---utils
|       date_time.js
|       environment.js
|       reflection.js
|       
|---views
    |   echo.jade
    |   index.jade
    |   layout.jade
    |   
    |---database
            db_create.jade
            db_edit.jade
            db_list.jade
            table_edit.jade
            table_view.jade
 
We'll also update the app.js so that pages can handle both get and post requests (where appropriate).

To save you the trouble, I've published a version of the project that does all these things: https://github.com/jasonbrice/nodeling

If you clone/fork/download it, you should be able to import it straight into Eclipse:



Try it out and let me know if it worked in the comments section...

To Do

To make this more like a "real" project, we should do a few things:
  • Sanitize and validate inputs!
  • Add more querying tools.
  • Use reflection so that object properties are not hard-coded.
  • Package utils code as user modules.
  • Write unit tests for each page.
  • Usability: add breadcrumbs, etc.

Other odds & ends

In getting ready to push this to Github, I did a quick inventory of files, sizes, etc. to make sure I wasn't pushing any unnecessary junk up. So I ran:
npm ls

And saw that at the end of the output, there was this:

¦     +-- uglify-to-browserify@1.0.1
+-- sqlite3@2.1.15 extraneous

npm ERR! extraneous: sqlite3@2.1.15 C:\Users\bricej\workspace\nodeling\node_modules\sqlite3
npm ERR! not ok code 0

Huh? Sqlite is TOO a dependency! In fact, if I use npm prune to remove extraneous node_modules, then try to start the app, I get an error. So what gives?
It turns out that some piece of the install mechanism (npm? sqlite3's installer?) never updated the package.json. So, to get node and npm to recognize sqlite3 as a real dependency, I hand-updated the package.json to the following:

{
  "name": "nodeling",
  "version": "0.0.1",
  "private": true,
  "scripts": {
    "start": "node app.js"
  },
  "dependencies": {
    "express": "3.3.7",
    "jade": "*", 
    "sqlite3" : "*"
  }
}

This says that the app is dependent on a version (any version) of sqlite3.
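As an aside, "*" is the bluntest possible constraint - it will happily pick up a future major version with breaking changes. npm's dependencies field also accepts narrower semver ranges; something like this (the range shown is illustrative, not a recommendation for a specific version):

```
"dependencies": {
  "express": "3.3.7",
  "jade": "*",
  "sqlite3": "2.x"
}
```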

Also, the "right way" to bundle the utility files like date_time.js and reflection.js would be to make them local node modules and add them as dependencies.

Additional Resources

Source code for this demo is here: https://github.com/jasonbrice/nodeling

This site is invaluable in understanding the node-sqlite3 wrapper: https://github.com/developmentseed/node-sqlite3

Eclipse and Nodeclipse help: http://www.nodeclipse.org/community/q-and-a

Everything you need to know about Node Package Manager: http://howtonode.org/introduction-to-npm

A great tutorial focused on Node.js and Express: http://evanhahn.com/understanding-express-js/

Another tutorial, using Expresso, Jade, MongoDB: http://dailyjs.com/2010/11/22/node-tutorial-4/

Another tutorial, using more or less same as above: http://blog.ijasoneverett.com/2013/03/a-sample-app-with-node-js-express-and-mongodb-part-1/

These tutorials are good to check out since they did similar things to this one, but show a few slightly different approaches. It's always helpful (to me) to understand how other people approach similar problems.

Monday, July 22, 2013

Fun with multi threading

This one will be short 'n sweet.

Background

It so happens that I have a project that involves decompiling a certain vendor's binary files into XML, then manipulating the data in that XML, then recompiling the XML back into the proprietary binary format.

The tool for de/re-compiling is provided by the vendor, and for our purposes let's say that it's a "black box" - we don't know how it does what it does, we just know that we give it an XML or binary file as input, and it spits out the opposite (sort of like that machine that could remove stars from the star-bellied Sneetches and vice versa).

In the cartoon version it was one machine. Use your imagination, people.

The Problem

The black box tool works Ok, but takes a second or two to spin up, do its thing, and produce a file of the expected output type. On my (pretty vanilla for 2013) dev laptop, it takes about 75 seconds to process 500 files.

Can we throw more threads at it?

As we all undoubtedly know, multi-threading is both the cause of, and solution to, all software performance problems (apologies to The Simpsons). Multi-threaded processes can be exceedingly difficult to debug, performance can actually be degraded if you do it wrong, and for some things it brings only modest performance gains at best.

So, will it work for my problem?

In theory it should. All of the complexity of what happens inside the black box is encapsulated away from my code and my poor brain. I just need to process lots of files, the more/faster the better.

This is the kind of thing the .NET Threadpool was made for.
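Before diving into the C# below, it's worth stating the shape of the solution on its own: queue every work item, but gate actual execution with a counting semaphore sized to the core count. The pattern isn't .NET-specific; here is the same idea sketched in a few lines of JavaScript (purely to illustrate the pattern, not part of this project):

```javascript
// A counting semaphore: at most `max` holders at once; extra acquirers wait.
function Semaphore(max) {
    var free = max;
    var waiters = [];
    this.acquire = function () {
        return new Promise(function (resolve) {
            if (free > 0) { free--; resolve(); }
            else { waiters.push(resolve); }
        });
    };
    this.release = function () {
        if (waiters.length > 0) { waiters.shift()(); }
        else { free++; }
    };
}

// Queue all jobs, but let only `max` of them run at any one time.
// (Error handling omitted for brevity - a failed job here would leak a slot.)
function runAll(jobs, max) {
    var sem = new Semaphore(max);
    return Promise.all(jobs.map(function (job) {
        return sem.acquire().then(function () {
            return Promise.resolve(job()).then(function (result) {
                sem.release();
                return result;
            });
        });
    }));
}
```

The C# implementation below does exactly this with System.Threading.Semaphore and the ThreadPool.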

The implementation

Well, I wish I could claim that I came up with the whole thing myself. But, standing on the shoulders of giants and all that, and consulting that oracle of sublime wisdom, The Internet (more specifically, resources like StackOverflow), I found someone trying to solve basically the same problem.

But what about this bit?
I only used four threads in my example because that's how many cores I have. It makes little sense to be using 20 threads when only four of them can be processing at any one time. But you're free to increase the MaxThreads number if you like.
Good point. The implementation should make use of as many cores as it can, but any more than that doesn't do any good. So how do you do that? Oh StackOverflow, is there anything you don't know?

Ok. So here's my completed (and heavily blog-post-ified) code, with probably the bare minimum of flair:

using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
using System.IO;
using System.Diagnostics;
 
namespace AcmeCo.BigProject
{
    /*
     * As far as the rest of the app is concerned, XmlCompiler does all the work.
     * You just tell it what XML files to process, and let it worry about the rest.
     */
    public class XmlCompiler
    {
        List<string> Files;
 
        private static int CoreCount = 0;
 
        // No zero arg constructor: this class is pointless without files to process.
        public XmlCompiler(List<string> Files)
        {
            this.Files = Files;
 
            // Stolen from http://stackoverflow.com/questions/1542213/how-to-find-the-number-of-cpu-cores-via-net-c
            // (requires a project reference to System.Management.dll; the guard
            // keeps the static count from doubling if we construct twice)
            if (CoreCount == 0)
            {
                foreach (var item in new System.Management.ManagementObjectSearcher("Select * from Win32_Processor").Get())
                {
                    CoreCount += int.Parse(item["NumberOfCores"].ToString());
                }
            }
        }
 
        public void Process()
        {
            DoTheWork();
        }
 
        private static int MaxThreads = 8;
        private Semaphore _sem = new Semaphore(MaxThreads, MaxThreads);
 
        // Stolen from http://stackoverflow.com/questions/15120818/net-2-0-processing-very-large-lists-using-threadpool
        void DoTheWork()
        {
            // Size the semaphore to the actual core count - the field
            // initializer above ran before CoreCount was known.
            MaxThreads = CoreCount;
            _sem = new Semaphore(MaxThreads, MaxThreads);

            int ItemsToProcess = this.Files.Count;
 
            Console.WriteLine("Processing " + ItemsToProcess + " on " + CoreCount + " cores");
 
            for (int i = 0; i < ItemsToProcess; ++i)
            {
                Console.WriteLine("Processing " + i + " of " + ItemsToProcess + ": " + Files[i]);
 
                // Block here until one of the MaxThreads slots frees up.
                _sem.WaitOne();
 
                XmlCompileTarget target = new XmlCompileTarget(Files[i]);
 
                ThreadPool.QueueUserWorkItem(Process, target);
            }
 
            // All items have been assigned threads.
            // Now, acquire the semaphore "MaxThreads" times.
            // When counter reaches that number, we know all threads are done.
            int semCount = 0;
            while (semCount < MaxThreads)
            {
                _sem.WaitOne();
                ++semCount;
            }

            // All items are processed.
            // Clear the semaphore for next time.
            _sem.Release(semCount);
        }

        void Process(object o)
        {
            // do the processing ...
            XmlCompileTarget target = (XmlCompileTarget)o;
            target.Process();

            // release the semaphore
            _sem.Release();
        }
    }

    // A "unit" of work... this class' job is to hand the file to the processing
    // utility and raise an event when it's done.
    public class XmlCompileTarget
    {
        private string file;

        public XmlCompileTarget(string file)
        {
            this.file = file;
        }

        public void Process()
        {
            Compilefile();
        }

        public static event EventHandler<XmlProcessEventArgs> OnProgress = delegate { };

        protected virtual void Progress(XmlProcessEventArgs e)
        {
            OnProgress.Raise(this, e);
        }

        private void Compilefile()
        {
            if (!System.IO.File.Exists(file))
            {
                OnProgress(this, new XmlProcessEventArgs(file, "File not found!"));
                return;
            }

            OnProgress(this, new XmlProcessEventArgs(file, XmlUtilities.RunTool(@"Tools\XmlComp.exe", new FileInfo(file), null)));
        }
    }

    // The processing utility runs the vendor's XML compiler and
    // returns any output from that tool as a string.
    public class XmlUtilities
    {
        public static string RunTool(string ExecutablePath, FileInfo FileInfo, string Arguments)
        {
            Process p = new Process();
            ProcessStartInfo info = new ProcessStartInfo();

            if (!File.Exists(ExecutablePath) || !FileInfo.Exists)
            {
                Console.WriteLine("Error: File path not found - \r\n"
                    + ExecutablePath + " exists == " + File.Exists(ExecutablePath) + "\r\n"
                    + FileInfo + " exists == " + FileInfo.Exists);
                return null;
            }

            Console.WriteLine(Arguments);

            info.FileName = ExecutablePath;
            info.Arguments = string.IsNullOrEmpty(Arguments) ? "\"" + FileInfo.FullName + "\"" : Arguments;
            info.UseShellExecute = false;
            info.RedirectStandardOutput = true;
            info.RedirectStandardError = true;
            info.RedirectStandardInput = true;
            info.ErrorDialog = true;
            info.Verb = "runas";
            info.WindowStyle = ProcessWindowStyle.Hidden;

            p.StartInfo = info;
            p.Start();

            string output = p.StandardOutput.ReadToEnd();
            string error = p.StandardError.ReadToEnd();
            p.WaitForExit();

            StringBuilder sb = new StringBuilder();
            sb.Append(output);
            return sb.ToString();
        }
    }

    // Not much to see here...
    public class XmlProcessEventArgs : EventArgs
    {
        private string filename;
        private string output;

        public XmlProcessEventArgs(string filename, string output)
        {
            this.filename = filename;
            this.output = output;
        }
    }

    // Ah, why this?
    // Because it is a drag to continually have to add tons
    // of thread-safe invocations on every last UI
    // element that might need updating as a result of the event
    // that was raised.
    // Isn't it better to make the *event* notification thread-safe,
    // and let UI elements be their merry little selves on their own
    // merry little thread?
    // But we digress...
    // Stolen from: http://stackoverflow.com/a/2150359/2124709
    public static class ExtensionMethods
    {
        /// <summary>Raises the event (on the UI thread if available).</summary>
        /// <param name="multicastDelegate">The event to raise.</param>
        /// <param name="sender">The source of the event.</param>
        /// <param name="e">An EventArgs that contains the event data.</param>
        /// <returns>The return value of the event invocation or null if none.</returns>
        /// <remarks>Usage: MyEvent.Raise(this, EventArgs.Empty);</remarks>
        public static object Raise(this MulticastDelegate multicastDelegate, object sender, EventArgs e)
        {
            object retVal = null;

            MulticastDelegate threadSafeMulticastDelegate = multicastDelegate;
            if (threadSafeMulticastDelegate != null)
            {
                foreach (Delegate d in threadSafeMulticastDelegate.GetInvocationList())
                {
                    var synchronizeInvoke = d.Target as System.ComponentModel.ISynchronizeInvoke;
                    if ((synchronizeInvoke != null) && synchronizeInvoke.InvokeRequired)
                    {
                        try
                        {
                            retVal = synchronizeInvoke.EndInvoke(synchronizeInvoke.BeginInvoke(d, new[] { sender, e }));
                        }
                        catch (Exception ex)
                        {
                            Console.WriteLine(ex.Message + Environment.NewLine + ex.StackTrace);
                        }
                    }
                    else
                    {
                        retVal = d.DynamicInvoke(new[] { sender, e });
                    }
                }
            }

            return retVal;
        }
    }
}

Benchmarking


What kind of impact did all of that have? Let's compare:

   Operating System: Windows 7 Professional 64-bit (6.1, Build 7601) Service Pack 1 (7601.win7sp1_gdr.130318-1533)
          Processor: Intel(R) Core(TM) i7 CPU       Q 720  @ 1.60GHz (8 CPUs), ~1.6GHz
             Memory: 8192MB RAM
Available OS Memory: 8124MB RAM
          Page File: 8020MB used, 8224MB available

 * No multi-threading (serial processing of each XML file), 500 files. Processing time: 75 seconds. 
 * Using code above (multi-threading on 4 cores), 500 files. Processing time: 15 seconds.

Well, we've cut our processing time to 20% of what it was originally. What happens if we add more cores? I happen to have access to a bigger box here somewhere...

   Operating System: Windows 7 Professional 64-bit (6.1, Build 7601) Service Pack 1 (7601.win7sp1_gdr.110408-1631)
          Processor: Intel(R) Core(TM) i7 CPU       X 990  @ 3.47GHz (12 CPUs), ~3.5GHz
             Memory: 6144MB RAM
Available OS Memory: 6136MB RAM
          Page File: 1932MB used, 12392MB available

 * Using code above (multi-threading on 6 cores), 500 files. Processing time: 5 seconds.

Summary and Conclusions

I started with a bottleneck that required my code to run an external tool to process many thousands of files. By running multiple instances of the external tool in parallel via multi-threading (and using hardware better suited for the job at hand), I was able to cut the net runtime to ~7% of what it took running sequentially in a single thread.

I can live with that.