Sunday, February 22, 2015

Blazing fast JSON query results with sockets

TL;DR

Clone this repository and change the database connection string to kickstart a Socket based CRUD app: https://github.com/jasonbrice/stealthis_sockets

Um Wut


For today's funfest, we'll be serving up some relational data over the web via websockets, and using a veritable buzzword-a-palooza of hawt toolz like: NodeJS (JavaScript engine), socket.io (websockets), Postgresql (database), and BookshelfJS/Knex (ORM layer).

Why would we do such a thing?

Let's imagine that you are a developer responsible for creating a web-based API that handles basic CRUD operations for a relational data store. Further suppose that the 'web based' part of the requirement means that your query language and results format are both JSON.

Let's make one final assumption, that you think writing endless layers of DAO objects and repeatedly testing for and debugging column/property mismatches feels a bit like going to the dentist with a sock full of fire ants.

Still Reading?


Let's get started. You're going to need NodeJS. Don't have NodeJS? This handy guide will help you.

First, pick an empty directory somewhere and create a text file called package.json. Make the contents of package.json the following:

{
  "name": "MahApp",
  "version": "0.0.0",
  "private": true,
  "scripts": {
    "start": "node app.js"
  },
  "dependencies": {  }
}

Next create a text file called app.js. The contents of app.js are these:


console.log('steal this tech!');


To verify that you installed NodeJS correctly, navigate to the project root directory with a command-line tool, and run the command:

npm start


You should see output similar to:

$ npm start

> MahApp@0.0.0 start c:\source\code\blog
> node app.js

steal this tech!


Congrats, you're almost ready to apply for the CTO gig at Google. Just a few more bits to go.

Next we're going to install some dependencies. First, let's make our new app a proper http/ws server.


npm install --save express express.io && npm list


Hopefully when this is done you see a dependency tree with express and express.io at the top levels.

BIG DISCLAIMER: depending on your environment (especially a Windows machine without various developer tools) this command might create some fireworks. It's beyond the scope of this post to set up all the dependencies you need for node-gyp or a C++ toolchain, but this is an easily solved problem with a bit of Google-Fu.

Once you get express and express.io installed, you're ready to start some servitude.

First, let's beef up our app.js to listen for http and ws connections. app.js should now look like:


app = require('express.io')()

app.http().io()

app.get('/', function(req, res) {
    res.sendfile(__dirname + '/index.html')
})

app.io.route('ready', function(req) {
    req.io.emit('status', 'A full-on websocket is at your service (' + new Date().toString() + ')');
})

var port = process.env.PORT || 3101;

console.log("steal this tech is listening on port " + port);
app.listen(port);


We've done a couple of things here: initialized a listener on port 3101, and told the http server to serve a file called index.html when someone makes an http request at the root. Let's create an HTML file that will then make a socket connection to the server. This piece is more or less stolen outright from the express.io examples


<html>
<script src="/socket.io/socket.io.js"></script>
<script>
  io = io.connect()

  // Emit ready event.
  io.emit('ready')

  // Listen for the status event.
  io.on('status', function(data) {
    document.body.innerHTML = data;
  })
</script>
<body></body>
</html>

Now, start the app again with 'npm start' and navigate to http://localhost:3101 with a modern browser. Preferably Chrome. Do you see a message? Good! Here's what happened:

  1. The app started up and began listening for both http and ws protocol requests on port 3101.
  2. By navigating to the root with a browser, you sent an http request, to which express replied by sending you the index.html
  3. Then, the index.html page loaded a javascript socket library, created a socket connection to express.io, and fired an event ('ready') to let the server side know it was ready
  4. The server side heard the event and fired its own event ('status') with a message payload
  5. The client (browser) side received the server's status event, and made the contents of the message payload visible. 

Cool!

In case you didn't already know this amazing tidbit, a websocket is a live connection between your browser and a server. Unlike "stateless" http, where a request and a response are sent (along with headers, cookies, and other attempts to mimic statefulness) for every request, even if it's just a teeny ajax call to update a few bytes of data. With websockets, transferring a few bytes of data only costs you a few bytes of data (nitpickers - I know this is a bit of an over simplification. It's a blog, not a PhD thesis).

Let's make this a bit more useful than just transferring cute messages between a client and a server.

Part II: You Know I'm All About the (Data)Base


For the next part of the show, we're going to create a database with a few tables. For the sake of following along, you'll either need to download/install/configure Postgresql and set up a database, or, do what I'm going to do and create a free account on elephantsql. Either way, I'm going to gloss over the initial setup of the database part other than to say by the end of it you should have a new database/schema capable of being connected to with a Postgresql connection string.

Also, if you don't have one already, you might want a graphical tool for developing and visualizing the database objects we'll be creating. Posgresql comes with a pretty good one (pgAdmin) which you can download as a standalone package if you don't want the entire Postgres footprint on your machine.

Ok, so following the ElephantSQL setup guide, I've got me a free hosted database I can access from my local machine.

Does this admin client make me look phat?

We're going to set up a very, very simple small database with three tables: a person, a login, and an address. The point is not to show off data normalization techniques, rather let's do something that could be the basis for an authentication system.

Putting the ERD in NERD


The DDL required to generate this structure can be found here.

Here comes the CRUD

In the bad old days, Conestoga Wagons hauled wooden barrels of SQL around that were carefully unpacked and hand-wired directly to CGI scripts, and later, VB.NET asp pages. These dark days eventually gave way to ORM frameworks like Hibernate, which let you replace pain-staking hard-coded SQL with pain-staking hard-coded XML. And there was much CASCADE confusion, and many Lazy Loading Exceptions were thrown, and lo there came the OpenSessionInView filter, and Spring eventually came along and washed away many of our tears.

These days, even with Microsoft's most bleeding edge version of Entity Framework, you still gotta write a metric boatload 'o code to do a full set of Create Read Update Delete operations from a web page to the database and back.

One day that will seem as dumb as leeches and phlogiston theory, but for now let's at least console ourselves with these shining beacons of sanity: BookshelfJS and Knex.

Back at our command line, let's install them:


npm install --save knex bookshelf pg

Knex is the "low level" library that will be generating the SQL statements, and Bookshelf is the higher ORM library that will let us identify the relationships between objects and tables with far, far less coding than you would need with Spring / Hibernate / Entity Framework. PG is the particular dialect we've chosen, but you could swap that out for mysql / sqlite / etc.

BookshelfJS Models & Collections

Let's get to cruddin'! At the root level of the app, create two new directories: models and collections. We'll put our BookshelfJS models in - you guessed it. We'll refer to arrays of them with collections.

For our first model, let's create a person. Create a new file called model/person.js. It should look like this:


/**
 * This is the model representation of the person table. It represents a single person.
 */

var Bookshelf = require('bookshelf').DB;

exports.model = Bookshelf.Model.extend({
  tableName: 'stealthis.person',
  idAttribute: 'id'
});

That's it. That's our model "glue" code. Now a collection. Create a new file called collections/people.js. It should looks like this:


/*
 * This represents a collection of all entries in the person table. We will use this for our list method.
 */

var Bookshelf = require('bookshelf').DB;

var Person = require("../models/person").model;

exports.collection = Bookshelf.Collection.extend({
  model: Person,
  tableName: "stealthis.person"
});

Now we're ready to connect Bookshelf to Knex, and then to a realtime socket.io route. The app.js should now look like this:



app = require('express.io')()

app.http().io()

app.get('/', function(req, res) {
  res.sendfile(__dirname + '/index.html')
})

var port = process.env.PORT || 3101;

app.io.route('ready', function(req){
  req.io.emit('status', 'A full-on websocket is at your service (' + new Date().toString() + ')');
})

// New code below! 

// Set up a connection to our database with a new Knex instance
var pg = require('knex')({
  client: 'pg',  debug: true,  connection: "postgres://****@babar.elephantsql.com:5432/mfifoqjd"  ,
  pool: {    min: 2,    max: 5  }
});

// Bring in Bookshelf!
var Bookshelf = require('bookshelf');
Bookshelf.DB = Bookshelf.initialize( pg );

// Make the app aware of our model
var People = require('./collections/person').collection;
var Person = require('./models/person').model;

// Define some realtime routes
app.io.route('person', {
    create: function(req) {
        // create a person
 Person.forge(req.data).save().then(function(person){
          req.io.emit('response', person)
        });
    },
    list: function(req) {
  // list all people
         new People().fetch({}).then(function (collection) {
           req.io.emit('response', collection)
         });
    },
});

// End of new code!

console.log("steal this tech is listening on port " + port);
app.listen(port);

Ok. That was a lot. We created a new Knex instance, then initialized Bookshelf with it. Then we told socket.io how to do realtime routing with create and list Person methods, inside which we made Bookshelf calls that do the lifting and send SQL via Knex.

Next comes mind === 'blown'


Start the app (npm start). Point a browser at your app (http://localhost:3101). Open a JavaScript console. Type the following in the console:


io.emit('person:create', {firstname: 'Ray', lastname: 'Charles'});

You should see something like:



Now. Open your Postgres SQL client and type: SELECT * FROM stealthis.person

You should see:

Are you feeling awesome? You should be. You just sent a JSON object from your freakin' console to a real actual hardcore database with like a few lines of JavaScript.

But we're not done getting more Awesomer. Not yet.

Part III: Rocking the Proverbial Casbah

At this point you could just download the finished code, tweak it, and run it yourself. But for the curious, here are a few final problems to solve.

1) We need to finish wiring up relationships between objects. Let's add the Login and Address classes, and make sure they all have the right CRUD methods.
2) Our app.js is starting to get crowded. If we add lots and lots of realtime routes, it'll be a big mess to understand and debug. We don't want a single misplaced semicolon to cause our app not to start, so let's break routes out into their own files.
3) Let's add a hashing library so that we're not storing passwords in cleartext.
4) We need to update index.html to be able to give us status updates.

So. Create a directory called realtime_routes. In realtime routes create a file called person.js. That file should look like this:


var People = require('../collections/person').collection;
var Person = require('../models/person').model;


module.exports = {

    list: function list(req) {
        new People().fetch({}).then(function (collection) {
                req.io.emit('response', {status: 200, results: collection})
            }
        )},

    select: function select(req) {
        Person.forge({id: req.data.id}).fetch({}).then(function (person) {
                req.io.emit('response', {status: 200, results: person})
            }
        )},

    create: function(req) {
        Person.forge(req.data).save().then(function(person){
            req.io.emit('response', {status: 200, results: person})
        });
    },

    update: function(req) {
        if(req.data.id){
            Person.forge(req.data).save().then(function(person){
                req.io.emit('response', {status: 200, results: person})
            });
        }else
        {
            req.io.emit('response', {status: 500, results: {error: "Could not update user", reason: "id was not specified"}});
        }
    }
};

And we can reduce the new code in app.js to:


// New code below! 

// Set up a connection to our database with a new Knex instance
var pg = require('knex')({
  client: 'pg',  debug: true,  connection: "postgres://****@babar.elephantsql.com:5432/mfifoqjd"  ,
  pool: {    min: 2,    max: 10  }
});

// Bring in Bookshelf!
var Bookshelf = require('bookshelf');
Bookshelf.DB = Bookshelf.initialize( pg );
// Reference the person route in a different file...
app.io.route('person', require('./realtime_routes/person'));

// End of new code!
// Le snazzee, n'est-ce pas?

Ok, a few more things for this demo then it's all yours. We should:

  1. Add realtime_routes for login and address, as well as require directives for them in the main app.
  2. Add an authenticate method for our Login model

Add Realtime Routes

Create realtime routes for address and login. You can copy/paste and find/replace using the working person.js route.

BUT, for login, let's delete all but the create method. And, let's add an authenticate method:


 authenticate: function(req)
    {
        if(req.data.email && req.data.password)
        {
            req.data.password = sha512(req.data.password).toString('hex');
            Login.forge( {email: req.data.email, password: req.data.password}).fetch({})
                .then(function (login) {

                login = login.toJSON();

                if(login.password) delete login.password;

                Person.forge({id: login.personid}).fetch({}).then(function (person) {
                    req.io.emit('response', {person: person, login: login});
                })
            }).catch(function(error){
                console.log(error);
                req.io.emit('response', 
                    {status: 500, error: 
                       {message: "Could not complete request", 
                        reason: "Credentials not found or database is down"}});
            });
        }
        else
        {
            req.io.emit('response', 
               {status: 400, error: 
                 {message: "Malformed request", reason: "Missing email or password"}});
        }
    },

We're also going to modify the login create method to scramble the password using the sha512 library. (DISCLAIMER - there are many other things we need to consider to make this app secure, but let's at least hash the password to keep our database safe from kids in treehouses.)

Now login create looks like this:


create: function(req) {

        if(req.data.password){
            req.data.password = sha512(req.data.password).toString('hex');
        }

        Login.forge(req.data).save().then(function(login){
            login = login.toJSON();
            if(login.password){ delete login.password; }
            req.io.emit('response', {status: 200, login: login})
        }).catch(function(error){
            console.log(JSON.stringify(error));
            req.io.emit('response', 
                {status: 500, error: 
                   {message: "Could not create login", reason: error.detail}});
        });
    }

You'll need to use npm to install sha512, and add a line at the top of login.js to require it.

Test that the login route is working by creating a login:

io.emit('login:create', {email: 'ray@atlanticrecords.com', password: 'Ray', personid: 3});

If it worked, you'll get a login object and a status object back. Otherwise, you'll get an error with a (hopefully) helpful message. (Make sure to swap out the personid with the correct primary key of the person record.)

Then, try and authenticate using your newly created login:

io.emit('login:authenticate', {email: 'ray@atlanticrecords.com', password: 'Ray'});


Next Steps

If you want to take this app further, here are some things you could do:


  1. Wire up the rest of the object model, and add to it. The completed project has a relationship between address and person but we didn't cover it here. 
  2. Worry about validation and security. You don't want just anybody making updates to your database.
  3. Create unit tests to automate the task of validating your mappings
  4. Move the entity relationship handling to Bookshelf. Right not we're giving the route an outsized role (e.g., the method to get a person from a login is handed-coded... Bookshelf can handle that for us! Maybe a follow-on blog will tackle this.)

And don't forget - if you like it, STEAL IT! https://github.com/jasonbrice/stealthis_sockets