Using geo-based data with SequelizeJS utilizing PostgreSQL and MS SQL Server in Node.js

Using geo-based data with SequelizeJS utilizing PostgreSQL and MS SQL Server in Node.js

I’m currently building an Angular 2 sample application, which will use location-based data. The app uses the browser’s navigator.geolocation feature to obtain the current position and send it to a server which returns a list of chat messages in a given radius around the sent coordinate. As a German student, you may know this from the app Jodel. For sample purposes only, the backend of the app can either use PostgreSQL or Microsoft SQL Server (MSSQL) which will be abstracted with the amazing SequelizeJS library. The app and the backend will later be open-sourced, so you can take a look at it yourself.

I’m pretty sure all the information in this blog post can be found elsewhere (and even in more detail). But it took me quite an amount of time to get it up and running. So I want to give you a condensed overview about it.

The intention of this blog post is to show the creation of a simple backend with the two different database engines. The code shown in this post is also hosted at Github. There is no talk about the Angular 2 frontend in this article, though.

Preparation

While MS SQL Server has a built-in Geographic Information System (GIS), PostgreSQL does not. Fortunately, PostgreSQL has an extension called PostGIS to support spatial- and geo-based data. Since I’m using a Mac for development, installing PostGIS is very easy if you use Postgres.app. It has integrated PostGIS support. If you don’t use the app, you need to refer to the PostGIS documentation for proper installation. After installing PostGIS you need to enable the extension for the database where you want to use it by executing CREATE EXTENSION postgis; against the database. That’s all you need to do.

Schema design

Both PostgreSQL and MSSQL support two different data types for spatial and geo-based data: geometry and geographic. Geometry data will be calculated on a planar plane. Geographic data, however, will be calculated on a sphere, which is defined by a Spatial Reference System Identifier (SRID, more on that below). Take a look at the following two images to see the difference.

Planar Coordinate System

Spherical Coordinate System

As you can see, within the planar coordinate system, a line would be drawn straight from New York to Berlin, resulting in not so accurate calculation results. As we all know the earth is not plane, therefore the spherical coordinate system takes that into account and calculates distances on a sphere, leading to more accurate results. Hopefully you don’t use a planar system to calculate the fuel for your airplane. ;-) In case of pure performance, geometry-based data will be faster, since the calculations are easier.

Some paragraphs above I mentioned a mandatory SRID when doing calculation on a spherical coordinate system. It is used to uniquely identify projected, unprojected or local spatial coordinate system definitions. In easier words, it identifies how your coordinates are mapped to a sphere where they are valid (e.g. whole world, or just a specific country) and which units they produce in case of calculations (kilometers, miles, …). For example, EPSG:4326/WGS84 is used for the worldwide GPS satellite navigation system, while EPSG:4258/ETRS89 can be used for calculations in Europe. It is also possible to convert data from one SRID into another SRID.

Before you start doing your schema or table design, you should consider whether you want to use geometry or geography. As a very simple rule of thumb: If you don’t need to calculate distances across the globe or you have data which represents the earth, just go with geometry. Otherwise take geography into account.

SequelizeJS and GIS

GIS support for SequelizeJS is, on the one hand, supported since 2014’ish. On the other hand, unfortunately, it is only implemented for PostgreSQL and PostGIS. There is a discussion going on for implementing a broader support for GIS. Another drawback is that only geometry is currently supported. If you need geography support, then SequelizeJS today can’t help you since it is not implemented as a data type at all. Nevertheless, for my little sample it is completely OK to go with geometry data, even when doing location-based search since the radius will be small enough to get good results. Actually, we can use SequelizeJS for both PostgreSQL and MSSQL! The next paragraphs explain what you need to do to achieve this.

Prepare SequelizeJS

For the sample backend I’m using Node.js v5.4.0. At the very first, we need to install the necessary dependencies. A simple npm i sequelize pg tedious  is what we need. sequelize  will install SequelizeJS. pg  is the database driver for PostgreSQL and tedious  the one for MSSQL.

Side note: There are official MSSQL drivers from Microsoft (here and here), but they are currently for Windows only.

Create the database connector class

Let’s start by creating a very simple and minimalistic class Database in ECMAScript 2015, which connects to the database and creates a model:

'use strict';

const Sequelize = require('sequelize');

function Database() {
    let sequelize;
    let dialect;
    let models = {};

    this.models = models;
    
    this.getDialect = function () {
        return dialect;
    };

    this.initialize = function (useMSSQL) {
        sequelize = useMSSQL ? connectToMSSQL() : connectToPostgreSQL();

        dialect = sequelize.connectionManager.dialectName;

        initializeModels();

        return syncDatabase();
    };

    function connectToMSSQL() {
        return new Sequelize('SampleDatabase', 'SampleUser', 'SamplePassword', {
            host: '10.211.55.3',
            dialect: 'mssql',
            dialectOptions: {
                instanceName: 'SQLEXPRESS2014'
            }
        });
    }

    function connectToPostgreSQL() {
        return new Sequelize('SampleDatabase', 'SampleUser', 'SamplePassword', {
            host: 'localhost',
            dialect: 'postgres'
        });
    }

    function initializeModels() {
        const SampleModel = sequelize.define('SampleModel', {
            id: {
                autoIncrement: true,
                type: Sequelize.INTEGER,
                primaryKey: true
            },
            point: {
                type: Sequelize.GEOMETRY('POINT'),
                allowNull: false
            }
        });

        models[SampleModel.name] = SampleModel;
    }

    function syncDatabase() {
        return sequelize.sync();
    }
}

module.exports = new Database();

Let’s dissect this code – first things first: Import Sequelize, so we can use it. Then we define the Database  class with a public field called models  and two public functions calledgetDialect and  initialize . The public field will hold our sample model, so we can use it later. The getDialect  function returns the used dialect either postgres  or mssql . The initialize  function is used to initialize and connect to the database. Within, we check if we want to connect to PostgreSQL or MSSQL. After connecting, we create a SampleModel  with an auto-incrementing primary key id  and a point  of type GEOMETRY(‘POINT’) . SequelizeJS supports different kinds of geometries, but that depends on the underlying database engine. With GEOMETRY(‘POINT’) we tell the database engine, we only want to store geometry of type point. Other valid kinds would be LINESTRING  or POLYGON . Or you can omit the type completely to use different kinds within the same column. At last, we store our model in our public field, so it is accessible via this.models.SampleModel  later on. Last, but not least, we use syncDatabase()  which calls sequelize.sync()  and returns a Promise . sequelize.sync()  will create the necessary tables for your defined models in this case.

*Side note: *All SequelizeJS methods which communicate with the database will return a Promise .

The module get’s exported as an instance/singleton.

Create the SampleService adapter

Next is a service class which will use our database and model to create entities and read data. The service will be a wrapper around the actual implementations for the different database engines and provides access methods which could be used by an user interface or Web API to access the data.

'use strict';

const SampleServiceMSSQL = require('./sampleService.mssql'),
        SampleServicePostgreSQL = require('./sampleService.postgres');

function SampleService(database) {
    const adapter = database.getDialect() === 'mssql'
            ? new SampleServiceMSSQL(database.models.SampleModel)
            : new SampleServicePostgreSQL(database.models.SampleModel);

    this.create = function (latitude, longitude) {
        // Do some input parameter validation

        const point = {
            type: 'Point',
            coordinates: [latitude, longitude]
        };

        return adapter.create(point);
    };

    this.getAround = function (latitude, longitude) {
        // Do some input parameter validation
        return adapter.getAround(latitude, longitude);
    };
}

module.exports = SampleService;

At first, we import two classes: SampleServiceMSSQL  and SampleServicePostgreSQL , since we need different approaches for handling our geometry data. Then we define a SampleService  which has a dependency to the database. Notice at the bottom that we export the class and not an instance. Remember, that database.initialize()  will return a Promise  when everything is set up. So we will construct the service later, when the Promise  has been resolved.

Within the class we check which underlying database engine we have. In case of MSSQL we construct SampleServiceMSSQL otherwise SampleServicePostgreSQL . Both of them get the model as their first argument. Same reason here: That ensures a resolved database.initialize()  Promise.

The class itself defines two methods. The first create()  will create a new entry in the database by the provided latitude  and longitude . To do so, a point  object is created with a property type  of value ‘Point’  and a property coordinates  containing an array with latitude  and longitude . This format is called GeoJSON and can be used throughout SequelizeJS. Then we call the adapter’s create  method.

Exactly the same is done with the second method getAround() . The purpose of this method will be to get all points in a radius around the given latitude  and longitude .

Please note, that this sample lacks any input validation by intention due to this blog posts scope.

Now we have a database and service class which functions as an adapter to the concrete implementations. Let’s build the implementations for PostgreSQL and MSSQL!

Implement the SampleServicePostgreSQL adapter class

We start by building the SampleServicePostgreSQL class:

'use strict';

function SampleServicePostgreSQL(model) {
    this.create = function (point) {
        return model.create({
            point: point
        });
    };

    this.getAround = function (latitude, longitude) {
        const query = `
SELECT
    "id", "createdAt", ST_Distance_Sphere(ST_MakePoint(:latitude, :longitude), "point") AS distance
FROM
    "SampleModels"
WHERE
    ST_Distance_Sphere(ST_MakePoint(:latitude, :longitude), "point") < :maxDistance
`;
        
        return model.sequelize.query(query, {
            replacements: {
                latitude: parseFloat(latitude),
                longitude: parseFloat(longitude),
                maxDistance: 10 * 1000
            },
            type: model.sequelize.QueryTypes.SELECT
        });
    };
}

module.exports = SampleServicePostgreSQL;

This is our adapter for PostgreSQL. The implementation of the create  method is really straightforward. Every SequelizeJS model contains a method create  which will insert the model data into the underlying database. Due to the support of PostGIS we can simply call model.create(point)  and let SequelizeJS take care of correctly inserting our data.

Let’s take a look at the getAround  method. As mentioned above, SequelizeJS has support for PostGIS. Unfortunately, it is a very basic support. It supports inserting, updating and reading, but no other methods like ST_Distance_Sphere , or ST_MakePoint  via a well-defined API abstraction. But according to this Github issue it is currently being discussed.  By the way, the mentioned methods are open standards from the Open Geospatial Consortium (OGC). We will see those methods later again, when implementing the MS SQL Server adapter.

Back to the getAround  method. First we declare our parameterized query. We select the id , the createdAt  and calculate a distance . OK, wait. What’s happening here? We don’t have a createdAt  property in our model, do we? Well, we have, but not an explicit one. Per default, SequelizeJS automatically creates an additional createdAt  and updatedAt  property for us and keeps track of them. SequelizeJS wouldn’t be SequelizeJS, if you can’t change this behavior.

What about the ST_Distance_Sphere(ST_MakePoint(:latitude, :longitude), “point”) AS distance? We use ST_MakePoint to create a point from our latitude  and longitude  parameters. Then we use the result as the first parameter for ST_Distance_Sphere. The second parameter “point”  references our table column. So for every row in our table SampleModels (SequelizeJS automatically pluralizes table names by default) we calculate the spherical distance (although it is a planar geometry object) between the given point and the one in our column. Be careful here and don’t get confused! ST_Distance_Sphere  calculates the distance with a given earth mean radius of 6370986 meters. If you want to use a real Spheroid according to the SRID mentioned above, you need to use ST_DistanceSpheroid.

The WHERE  part of the query will be used to only select data which is within a provided radius represented by the named parameter maxDistance. Last, but not least, we run this query against our PostgreSQL by calling model.sequelize.query . The first parameter is our query , the second is some options. As you may have noticed, we used named placeholders in our query. Therefore, we use the replacements  objects to tell SequelizeJS the values for placeholders. latitude  and longitude  are self-explanatory. maxDistance  is set to 10 kilometers, so we only get points in the given radius. With the type  property we set the type of the query to a SELECT  statement.

So far, so good, our PostgreSQL adapter is done. Let’s move on to the MSSQL adapter!

Implement the SampleServiceMSSQL adapter class

The code for the SampleServiceMSSQL  class is the following:

'use strict';

function SampleServiceMSSQL(model) {
    this.create = function (point) {
        const query = `
INSERT INTO [SampleModels]
    (
        [point],
        [createdAt],
        [updatedAt]
    )
VALUES
    (
        geometry::Point(${point.coordinates[0]}, ${point.coordinates[1]}, 0),
        ?,
        ?
    )`;
        
        return model.sequelize.query(query, {
            replacements: [
                new Date().toISOString(),
                new Date().toISOString()
            ],
            model: model,
            type: model.sequelize.QueryTypes.INSERT
        });
    };

    this.getAround = function (latitude, longitude) {
        const maxDistance = 10 * 1000;
        const earthMeanRadius = 6370986 * Math.PI / 180;

        const query = `
SELECT
    [id], [createdAt], [point].STDistance(geometry::Point(?, ?, 0)) * ? AS distance
FROM
    [SampleModels]
WHERE
    [point].STDistance(geometry::Point(?, ?, 0)) * ? < ?
        `;

        return model.sequelize.query(query, {
            replacements: [
                latitude,
                longitude,
                earthMeanRadius,
                latitude,
                longitude,
                earthMeanRadius,
                maxDistance
            ],
            type: model.sequelize.QueryTypes.SELECT
        });
    };
}

module.exports = SampleServiceMSSQL;

Let’s go through this, step by step. Due to the complete lack of geometry support in MSSQL we need to do everything manually now. Take a look at the create  method. We start with defining our INSERT query , and insert the values: point , createdAt  and updatedAt . If we execute a raw query we need to take care about setting the createdAt  and updatedAt  values. For the value of point  we use geometry::Point(${point.coordinates[0]}, ${point.coordinates[1]}, 0) . If you are not familiar with JavaScript’s templated strings this may hurt your eyes a bit. The syntax ${expression}  simply inserts the value into the string. geometry::Point()  is MSSQL’s equivalent to the ST_MakePoint  mentioned above with one difference. It wants to have a third parameter, the SRID. Since we don’t use it here we simply can use 0.

You may have noticed that we don’t use named parameters here. SequelizeJS automatically recognizes everything that is prefixed with a colon. So it would try to replace :Point  with a named parameter. Fortunately, the replacements objects can be an array as well and replaces all the question marks with the values defined in the order of their appearance. Additionally we supply a property model  with the value of our model . This tells SequelizeJS to automatically map the result of the INSERT  statement to our model. Finally, we set the kind of the query to INSERT .

Now to our last method getAround . It is basically the same as the one from the PostgreSQL adapter, but since we don’t use a SRID for calculation, MS SQL Server will calculate on a plane. Thats why we multiply the result with the earth mean radius to get the distance in meters. Note: This is slightly less accurate than the PostgreSQL version of calculation with ST_Distance_Sphere .

Wow. Take a deep breath, we have finished the database and service classes. The last thing to do is a bit of orchestration to try everything out!

Orchestration

Create a new index.js  file with the following content:

'use strict';

const database = require('./database'),
    Service = require('./sampleService');

let service;

database.initialize(false)
    .then(() => {
        service = new Service(database);

        return service.create(49.019994, 8.413086);
    })
    .then(() => {
        return service.getAround(49.013626, 8.404480);
    })
    .then(result => {
        console.log(result);
    });

Absolutely straight forward. Import the database and the SampleService  class. Then initialize the database with PostgreSQL connection. After initialization, create a new Service  with the database  and insert a coordinate. Then call service.getAround()  with another coordinate and print the result to the console. To run the sample app, open a terminal where you index.js  is located and execute node index.js . You should now see the distance between the Schloss Karlsruhe and the Wildparkstadion which looks like this:

Sample Output

SequelizeJS outputs the executed query per default (with the replaced values, which means you can easily execute the statement manually and take a look at its execution plan for optimizing. How awesome!). If you don’t like it, change it. ;-)

At the bottom of the output, right after the SQL statement, is our result (PostgreSQL):

[
    {
        "id": 3,
        "createdAt": "Fri Jan 08 2016 09:03:07 GMT+0100 (CET)",
        "distance": 1185.92294455
    }
]

The same sample executed with MS SQL Server results in:

[
    {
        "id": 4,
        "createdAt": "Fri Jan 08 2016 09:11:44 GMT+0100 (CET)",
        "distance": 1190.4306593755073
    }
]

As you can see, there is a slight distance difference (approx. 5 meters) which could increase, if the distances get greater. Since the sample app will only make use of data in a 10 km radius, it is completely ok.

If you want to download this sample, head over to Github.