IoT Protocols for CI Monitoring

I have been wanting to work on this idea of using IoT protocols like MQTT/AMQP for CI event monitoring and self-discovery, mainly because of their beautiful pub/sub model and lightweight nature (MQTT runs over TCP and can also be carried over WebSockets). Now that I have been able to explore and integrate Terraform.IO with ServiceNow for cloud CI provisioning, this came as the next logical step, and I have spent some time actually implementing a POC for this concept.

Before you read further, please bear in mind that this is just a POC, and I have taken quite a few liberties to prove the concept. I was aware of some, and unaware of the rest, of the consequences and constraints we may face when we actually choose to implement this in an enterprise. The point I want to make here is that ‘this’ is possible, and I am open to constructive challenges and collaborations. I will divide this topic into the parts below.

  1. Introduction to AMQP/MQTT
  2. Cloud CI Self-Discovery
  3. Proactive Event Management
  4. Next Steps and Possibilities (separate upcoming post)

To limit the length of this blog post, I will give enough pointers and links to external resources to pique your interest, rather than explain everything in detail. However, the diagram below is your friend for a quick, superficial understanding.

ProActive CMDB

1. Introduction to AMQP/MQTT

AMQP (Advanced Message Queuing Protocol) and MQTT (Message Queuing Telemetry Transport) are, as the names suggest, messaging protocols used for machine-to-machine communication, especially in the field of IoT. Working on IoT projects involves working with a lot of sensors and devices which generate a lot of data to be sent to a central system for analysis. These protocols are very lightweight and on-demand in nature, meaning you don’t have to keep polling these devices for information; rather, the devices can trigger and send the relevant messages by themselves when required. At the same time, the messages are very short and can be counted in bytes instead of kilobytes. There are several open source message brokers (e.g. RabbitMQ, Mosquitto, Mosca) which support these protocols and manage the various ‘channels’ on which devices can publish messages or to which they can subscribe to receive published messages. Reasons to choose these protocols are summarized below:

To get an idea of the amount of data being saved - check out this MQTT RBE vs. Poll/Response Calculator.
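To make the "bytes instead of kilobytes" point concrete, here is a minimal Node.js sketch that encodes a single MQTT 3.1.1 PUBLISH packet at QoS 0 and then compares a day of polling traffic with a day of report-by-exception (RBE) traffic. The topic name, device count, poll interval and per-poll size are made-up inputs for illustration only, and the arithmetic is a deliberate simplification, not the formula used by the linked calculator.

```javascript
// Encode MQTT's variable-length "remaining length" field (7 bits per byte).
function encodeRemainingLength(length) {
  const bytes = [];
  do {
    let digit = length % 128;
    length = Math.floor(length / 128);
    if (length > 0) digit |= 0x80; // continuation bit: another byte follows
    bytes.push(digit);
  } while (length > 0);
  return Buffer.from(bytes);
}

// Build an MQTT 3.1.1 PUBLISH packet at QoS 0 (so no packet identifier).
function encodePublish(topic, payload) {
  const topicBytes = Buffer.from(topic, 'utf8');
  const topicLength = Buffer.alloc(2);
  topicLength.writeUInt16BE(topicBytes.length, 0);
  const body = Buffer.concat([topicLength, topicBytes, Buffer.from(payload, 'utf8')]);
  const fixedHeader = Buffer.from([0x30]); // type 3 = PUBLISH, flags = 0
  return Buffer.concat([fixedHeader, encodeRemainingLength(body.length), body]);
}

// Daily traffic if a central system polls every device on a fixed interval.
function pollingBytesPerDay({ devices, pollIntervalSeconds, bytesPerPoll }) {
  return devices * (86400 / pollIntervalSeconds) * bytesPerPoll;
}

// Daily traffic if devices only publish when something changes (RBE).
function rbeBytesPerDay({ devices, eventsPerDevicePerDay, bytesPerEvent }) {
  return devices * eventsPerDevicePerDay * bytesPerEvent;
}

// Hypothetical topic and workload numbers, purely for illustration.
const packet = encodePublish('ci/vm-01/status', 'up');
const polling = pollingBytesPerDay({ devices: 1000, pollIntervalSeconds: 10, bytesPerPoll: 200 });
const rbe = rbeBytesPerDay({ devices: 1000, eventsPerDevicePerDay: 20, bytesPerEvent: packet.length });

console.log(`one PUBLISH packet: ${packet.length} bytes`);
console.log(`polling: ${polling} bytes/day, RBE: ${rbe} bytes/day`);
```

The sketch counts only MQTT-level bytes and ignores TCP/IP and TLS overhead, so treat the result as an order-of-magnitude comparison rather than a measurement.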

Referring to the diagram, the host for the message broker is represented by the dotted block titled “MQTT/AMQP”. Within this block, the green block indicates the environment required to run the broker, which is itself shown in sky blue.

2. Cloud CI Self-Discovery

In order to connect the clients, i.e. the CIs, we need to install a client on each of them. Yes, this is agent-based, but the client is not memory- or processor-intensive. In fact, it is a simple script which connects to the broker and monitors the system health locally. It can be written in any language of your choice; we have used NodeJS. For example, if you are using RabbitMQ as your broker, here is the list of languages in which you can develop your clients. For those who are still weighing the pros and cons of agentless and agent-based discovery, the advantages of the MQTT/AMQP protocols described in the previous part may contribute to your calculations.

One of the key advantages of the agent-based approach is that it enables self-discovery of new cloud CIs as they are provisioned. Terraform, along with some cloud providers themselves, can run post-provisioning scripts which set up the basic environment and tools of your choice once the CI is provisioned successfully. The client can be installed in the same way, and it can then send an up event to the broker, which can in turn be queued in ServiceNow and trigger the Discovery of these cloud CIs. This queuing of up events helps validate and maintain the purity of the CMDB populated by Discovery.
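As a rough illustration of the up event, here is a minimal Node.js sketch of what such a client could do right after provisioning. The topic layout `ci/<hostname>/up` and the payload fields are assumptions made for this example, not the format used in the POC, and the `publish` function is injected so that the sketch runs without a broker.

```javascript
const os = require('node:os');

// Build the "up" event for this machine.
// Assumption: the topic layout ci/<hostname>/up and the payload fields are
// hypothetical; choose whatever your Discovery trigger actually needs.
function buildUpEvent(now = new Date()) {
  const hostname = os.hostname();
  return {
    topic: `ci/${hostname}/up`,
    message: JSON.stringify({
      event: 'up',
      hostname,
      platform: os.platform(),
      cpuCount: os.cpus().length,
      totalMemoryBytes: os.totalmem(),
      timestamp: now.toISOString(),
    }),
  };
}

// Publish the up event through whatever transport is passed in.
function announceUp(publish) {
  const { topic, message } = buildUpEvent();
  publish(topic, message);
}

// Stand-in publisher that just prints; a real client would hand the
// topic and message to its broker connection instead.
announceUp((topic, message) => console.log(topic, message));
```

With the `mqtt` package from npm, for instance, the stand-in could be replaced by something like `(topic, message) => client.publish(topic, message)`, where `client` comes from `mqtt.connect(brokerUrl)`. A post-provisioning script would then only need to install Node.js, drop this file on the VM and start it.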

Referring to the image, the dotted block in the middle represents various cloud IaaS platforms. Green blocks represent the provisioned VMs, and each VM has a sky-blue client installed on it which communicates with the broker.

3. Proactive Event Management

Continuing on the same notion of agent-based monitoring, the agents can be programmed to monitor system events locally, for example fatal shutdowns, critical process failures, and memory and CPU thresholds. These agents do a dual job: they publish appropriate messages to the appropriate channels and also listen (subscribe) to the required channels. In this scenario, there is no need for any third-party system to keep asking ALL the provisioned cloud CIs “are you okay?”. The polling-based approach has two drawbacks:

Both these issues are addressed by the approach we have discussed so far, and the event monitoring aspect thus becomes proactive in nature. The ServiceNow platform does not support the AMQP/MQTT protocols, so the AMQP/MQTT messages received by the broker need to be converted into REST messages before they are sent to ServiceNow for consumption. Referring to the diagram, the green block named NOW Agent does this job. This agent also has a client installed on it; it subscribes to all the channels of the broker and interfaces with ServiceNow to send messages of the types below:

… which can then be further processed by ServiceNow to trigger the appropriate ITSM process.
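To sketch the path of a single event through this setup, the Node.js example below shows the two halves: an agent-side check that only reports a metric when it crosses a threshold, and a NOW Agent-side function that turns a broker message into a REST request description. Everything specific here is an assumption for illustration: the threshold values, the `ci/<hostname>/<event>` topic layout, the instance URL, and the choice of posting to the `em_event` table through the ServiceNow Table API. The POC may well use a different endpoint and payload, and authentication is left out entirely.

```javascript
// Hypothetical thresholds, in percent.
const THRESHOLDS = { cpuPercent: 90, memoryPercent: 85 };

// Agent side: return one entry per metric that is over its threshold.
// An empty array means there is nothing to publish, so the agent stays silent.
function detectBreaches(sample, thresholds = THRESHOLDS) {
  return Object.keys(thresholds)
    .filter((metric) => sample[metric] > thresholds[metric])
    .map((metric) => ({ metric, value: sample[metric], threshold: thresholds[metric] }));
}

// NOW Agent side: map a broker message to a REST request description.
// Assumption: topics look like ci/<hostname>/<event>, and events are posted
// to the em_event table; adjust both to match your own instance.
function toRestRequest(instanceUrl, topic, message) {
  const [, hostname, eventType] = topic.split('/');
  return {
    method: 'POST',
    url: `${instanceUrl}/api/now/table/em_event`,
    headers: { 'Content-Type': 'application/json', Accept: 'application/json' },
    body: JSON.stringify({
      source: 'mqtt-bridge', // hypothetical source label
      node: hostname,
      type: eventType,
      description: message,
    }),
  };
}

// Example run with a made-up sample and a placeholder instance URL.
const breaches = detectBreaches({ cpuPercent: 97, memoryPercent: 40 });
const requests = breaches.map((breach) =>
  toRestRequest('https://example.service-now.com', 'ci/vm-01/threshold', JSON.stringify(breach))
);

for (const request of requests) {
  console.log(request.method, request.url, request.body);
}
```

The request is only built here, not sent. In a real NOW Agent the object could be handed to an HTTP client such as Node's built-in `fetch`, with credentials added, inside the callback that receives messages from the broker.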

Thank you if you have made it this far. In my opinion, this type of implementation opens up a world of opportunities, a couple of which I will be presenting in the next post. Yes, I am that cheesy.

I would like to thank these gems who are working as a team on this: Nachiket, Subodh, Madhuri and Sanchita.