In this tutorial we'll build a simple uptime monitor in Node.js that posts status changes to Slack. If you're not familiar with uptime monitors: in its simplest form, one pings services at a specified interval and checks a couple of things, such as the mean response time and the response status of each service.

For our purposes, we'll specify a list of services we want to monitor and check for 2 things:

  • The mean response time over the last 3 responses
  • The response status code to see if the service is even functional or not (whether it's a 200 or not)

Based on those 2 items we can define 3 states each of our services can be in:

  • OPERATIONAL: the service's mean response time is below the threshold we specified and the response status code is a 200
  • DEGRADED: the service is functioning (status code 200) but the mean response times are above the threshold we set
  • OUTAGE: the service responded with a status code that is not a 200
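As a rough illustration of the three states above, here is a minimal sketch of the classification as a pure function. The name classifyStatus and its parameters are hypothetical and not part of the code we write later. The list above doesn't say what happens when the mean is exactly equal to the threshold, so the sketch assumes that case counts as OPERATIONAL.

```javascript
// Hypothetical helper (not part of the tutorial's index.js): maps one check
// result onto the three states described above.
const classifyStatus = (statusCode, meanResponseTime, threshold) => {
  // Anything other than a 200 is treated as an outage
  if (statusCode !== 200) return 'OUTAGE'
  // Assumption: a mean exactly equal to the threshold is still OPERATIONAL
  return meanResponseTime > threshold ? 'DEGRADED' : 'OPERATIONAL'
}

console.log(classifyStatus(200, 120, 200)) // OPERATIONAL
console.log(classifyStatus(200, 350, 200)) // DEGRADED
console.log(classifyStatus(503, 80, 200))  // OUTAGE
```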

Project Setup

First things first. Let's go ahead and set up the required files and install our dependencies.

The project consists of three files:

  • index.js will contain the bulk of our code to ping the services and post to Slack
  • services.js will contain a list of the services we want to monitor
  • .env will contain our Slack Webhook URL

Go ahead and create an empty directory, then cd into it:

mkdir node-uptime-monitor && cd node-uptime-monitor

Create a .env file that contains the following:

SLACK_WEBHOOK_URL=

Later, we'll create an Incoming Slack Webhook URL, which will be used to post status updates to Slack. It may seem like overkill to create a .env file and install an additional dependency when we could just hard-code the webhook URL, but it's good practice to keep environment-specific configuration and sensitive information out of source control.

Initialize a package.json and install our dependencies:

npm init -y
npm install --save dotenv request

dotenv will be used to load the project's configuration from the .env file into the Node process's environment variables. In our case, that's only process.env.SLACK_WEBHOOK_URL.

request will be used to make HTTP GET requests to the services and provide timing information.
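Because the .env file starts out with an empty SLACK_WEBHOOK_URL, it can help to fail fast at startup when the value is missing. Below is a minimal sketch of such a check. The requireEnv helper is hypothetical (it is not part of dotenv), and it takes the environment object as a parameter only so it can be tried without touching process.env.

```javascript
// Hypothetical helper, not part of dotenv: returns the value of a required
// variable, or throws a readable error if it is missing or blank.
const requireEnv = (name, env = process.env) => {
  const value = env[name]
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`)
  }
  return value
}

// Usage with a stand-in environment object (placeholder URL)
const fakeEnv = {
  SLACK_WEBHOOK_URL: 'https://hooks.slack.com/services/XXXXXX/YYYYYY/XXXXXXXXXXXX'
}
console.log(requireEnv('SLACK_WEBHOOK_URL', fakeEnv))
```

In index.js, a call like requireEnv('SLACK_WEBHOOK_URL') would go after require('dotenv').config(), so the variables from .env are already loaded.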

Next, let's create the services.js file and specify some services we wish to monitor:

module.exports = [
  {
    url: 'http://localhost:3000', // URL of the service we'll be pinging
    timeout: 200 // threshold in milliseconds above which is considered degraded performance
  },
  {
    url: 'http://localhost:3001/health',
    timeout: 300
  }
]

Let's put the skeleton of the index.js file together which we'll populate in the next sections:

// index.js

require('dotenv').config() // load the variables from the .env file

const request = require('request')
const services = require('./services.js') // list of services to monitor

Writing the Ping Function

The ping function is simple. It will:

  1. take a url and cb (callback) as parameters
  2. use the request library to make a GET request to the specified url
  3. pass the callback either the response time computed by the request library, or an OUTAGE status if the request fails or the response is not a 200

In code, this looks something like:

// index.js

...

const pingService = (url, cb) => {
  request({
    method: 'GET',
    uri: url,
    time: true
  }, (err, res, body) => {
    if (!err && res.statusCode === 200) {
      // we'll use the time from when the connection is established until
      // the first byte of the response is received
      cb(res.timingPhases.firstByte)
    } else {
      cb('OUTAGE')
    }
  })
}

Setting Up the Monitoring

Now for the meat of the logic - the monitoring. Before we jump into the code, let's break down what we want to achieve.

For each service in services.js:

  • Call setInterval to ping the service every 5 minutes
  • If the pingService function returns OUTAGE:
    • log a service outage
  • Else:
    • compute the average of last 3 round trips
      • If average response time > timeout threshold, log a performance degradation

At its core, that's all there is to it. We also need to add some safeguards to ensure we only post to Slack when a service's status changes (e.g. going from OPERATIONAL to OUTAGE), to avoid spamming the channel.
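To see the rolling average and the state-change guard in isolation, here is a minimal sketch that runs without any network calls. The names pushSample, mean, and nextStatus, the 200 ms threshold, and the sample timings are illustrative assumptions, not part of index.js. The sketch also assumes the status is evaluated as soon as three samples are available.

```javascript
const WINDOW_SIZE = 3

// Returns a new array holding at most the last WINDOW_SIZE samples
const pushSample = (samples, sample) => [...samples, sample].slice(-WINDOW_SIZE)

const mean = (values) => values.reduce((a, b) => a + b, 0) / values.length

// Returns the status implied by the window, or null while there are
// not yet enough samples to average
const nextStatus = (samples, threshold) => {
  if (samples.length < WINDOW_SIZE) return null
  return mean(samples) > threshold ? 'DEGRADED' : 'OPERATIONAL'
}

// Simulate a run of made-up response times (in ms) against a 200 ms threshold
let samples = []
let status = 'OPERATIONAL'
const transitions = []

for (const ms of [120, 150, 400, 450, 500, 100, 90, 80]) {
  samples = pushSample(samples, ms)
  const next = nextStatus(samples, 200)
  // Only react when the status actually changes
  if (next !== null && next !== status) {
    status = next
    transitions.push(status)
    console.log(`status changed to ${status} (mean ${mean(samples).toFixed(0)} ms)`)
  }
}
```

With these sample values the status changes only twice: to DEGRADED after the third sample (mean of about 223 ms) and back to OPERATIONAL after the last one (mean of 90 ms). The slow samples in between don't trigger any further notifications.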

// index.js

...

const pingInterval = 5*1000*60 // 5 minutes
let serviceStatus = {}

services.forEach(service => {
  serviceStatus[service.url] = {
    status: 'OPERATIONAL', // initialize all services as operational when we start
    responseTimes: [], // array containing the response times for the last 3 pings
    timeout: service.timeout // load up the timeout from the config
  }

  setInterval(() => {
    pingService(service.url, (serviceResponse) => {
      let currService = serviceStatus[service.url]

      if (serviceResponse === 'OUTAGE') {
        // only update and post to Slack on state change
        if (currService.status !== 'OUTAGE') {
          currService.status = 'OUTAGE'
          postToSlack(service.url)
        }
        // never record 'OUTAGE' as a response time
        return
      }

      let responseTimes = currService.responseTimes
      responseTimes.push(serviceResponse)

      // keep only the last 3 response times (drop the oldest, at the beginning of the array)
      if (responseTimes.length > 3) {
        responseTimes.shift()
      }

      // check for degraded performance once we have 3 responses to average
      if (responseTimes.length === 3) {
        let avgResTime = responseTimes.reduce((a, b) => a + b, 0) / responseTimes.length

        if (avgResTime > currService.timeout && currService.status !== 'DEGRADED') {
          currService.status = 'DEGRADED'
          postToSlack(service.url)
        } else if (avgResTime <= currService.timeout && currService.status !== 'OPERATIONAL') {
          currService.status = 'OPERATIONAL'
          postToSlack(service.url)
        }
      }
    })
  }, pingInterval)
})

Posting Status Changes to Slack

For the last piece of the puzzle, we need to post the status changes in the services to Slack. Thankfully, Slack has an app called Incoming WebHooks that does the bulk of the work for us.

Head on over to the Incoming WebHooks Slack app. If you're signed in to your Slack Workspace you should see an Add Configuration button. Hit it!

Select or create a channel, then hit the Add Incoming WebHooks integration button.

Grab the WebHook URL and paste it into the .env file we created earlier, which would look like so:

SLACK_WEBHOOK_URL=https://hooks.slack.com/services/XXXXXX/YYYYYY/XXXXXXXXXXXX

We can now POST data to this WebHook URL and Slack will automatically post the text to the channel we selected earlier.

All that's left is to create the function which we'll use to send the POST request to the WebHook URL with the payload:

// index.js

...

const postToSlack = (serviceUrl) => {
  let slackPayload = {
    text: `*Service ${serviceStatus[serviceUrl].status}*\n${serviceUrl}`
  }

  request({
    method: 'POST',
    uri: process.env.SLACK_WEBHOOK_URL,
    body: slackPayload,
    json: true
  }, (err, res, body) => {
    if (err) console.log(`Error posting to Slack: ${err}`)
  })
}

The function is rather simple. Given a service's URL, it looks up that service's status in the serviceStatus object and POSTs the status to the WebHook URL.
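If you'd like the message to carry a bit more context, one option is to build the payload in a separate pure function, which is also easy to try without sending anything to Slack. The sketch below is an assumption-laden variation rather than the code used above: buildSlackPayload, the STATUS_EMOJI mapping, and the timestamp line are all hypothetical additions.

```javascript
// Hypothetical mapping from status to a Slack emoji shortcode
const STATUS_EMOJI = {
  OPERATIONAL: ':white_check_mark:',
  DEGRADED: ':warning:',
  OUTAGE: ':red_circle:'
}

// Hypothetical helper: builds the JSON body for the webhook. The `now`
// parameter exists only so the timestamp can be fixed when trying it out.
const buildSlackPayload = (serviceUrl, status, now = new Date()) => {
  const emoji = STATUS_EMOJI[status] || ':grey_question:'
  return {
    text: `${emoji} *Service ${status}*\n${serviceUrl}\n_${now.toISOString()}_`
  }
}

console.log(buildSlackPayload('http://localhost:3000', 'OUTAGE').text)
```

The returned object has the same shape as slackPayload in postToSlack, so it could be passed as the body of the same request.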

Putting it All Together

That's it folks! You can update services.js with the services you wish to monitor and throw the code up on a server. The full source code for the project can be found here.

Also, I'd encourage you to try implementing an uptime monitor using serverless technologies such as AWS Lambda or Auth0's Webtasks, or whatever your preferred stack is!