Tuesday, March 5, 2013

Sharing monitoring status with your internal and external users

The purpose of a monitoring tool like ServersCheck is to monitor your environment and alert you when infrastructure, network, server or application issues are detected, so that you can resolve them quickly.

Users start calling you when they experience an issue because you are the expert in that field in your organization. With a good monitoring tool in place, you already know that there is a problem. Rather than answering users' phone calls, you want to focus on the issue at hand and resolve it.

One way to address that communication issue is to share the monitoring status in a user-friendly way with your internal and/or external user base.

System administrators often look for a multi-user monitoring platform so that read-only users can connect directly to the system to see the status. There are two major downsides to such a strategy:
1) First, a monitoring platform is complex for non-technical users to understand, so giving them access to it won't really help them.
2) The second issue is more critical: security. Issuing additional login credentials for a critical platform such as monitoring software weakens security. Users might be tempted to try other username / password combinations, in case one of them works. It also opens up the platform to the outside world.

Network monitoring tools like ServersCheck address that issue with a two-part approach:
1) make the data available in a user-friendly way, so that non-technical users can understand where the problem is
2) make it available in a location that is easy for them to find and understand.

Let's illustrate the above with the way we deployed it at ServersCheck. As part of our monitoring software and sensors, we offer an SMS alerting service that allows devices to connect to our systems and send out SMS alerts to users. We monitor the availability and performance of our network and systems with our Monitoring Software, running on a dedicated server with WLAN access to our different offices. The data relevant to our user community is updated every minute on a dedicated website. We link from every page on our websites to the status page, so a user who experiences a problem can immediately check whether one of our online properties has a service issue. While he looks at it, he knows that we know about the problem and that we are working on it.
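
To make the idea more concrete, here is a minimal sketch in Python of how monitoring results could be turned into a plain status page for non-technical users. It is not how ServersCheck implements its status page; the names ServiceStatus, overall_message and render_status_page, the wording of the messages and the HTML layout are all assumptions made for this illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from html import escape


@dataclass
class ServiceStatus:
    """One user-facing service (hypothetical structure for this sketch)."""
    name: str        # a name users recognize, not a hostname or sensor ID
    ok: bool         # True when the latest check succeeded
    note: str = ""   # optional short explanation in plain language


def overall_message(services):
    """Summarize the state in one sentence a non-technical user can read."""
    down = [s.name for s in services if not s.ok]
    if not down:
        return "All services are operating normally."
    return ("We are aware of a problem with: " + ", ".join(down)
            + ". We are working on it.")


def render_status_page(services, checked_at):
    """Return a small static HTML page; no login and no access to the
    monitoring platform itself is needed to read it."""
    rows = "\n".join(
        "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
            escape(s.name), "OK" if s.ok else "Problem", escape(s.note))
        for s in services
    )
    return (
        "<html><body>\n"
        "<h1>Service status</h1>\n"
        f"<p>{escape(overall_message(services))}</p>\n"
        f"<table>\n{rows}\n</table>\n"
        f"<p>Last updated: {checked_at:%Y-%m-%d %H:%M} UTC</p>\n"
        "</body></html>\n"
    )


if __name__ == "__main__":
    # Example data; in practice this would come from the monitoring tool.
    services = [
        ServiceStatus("SMS alerting service", True),
        ServiceStatus("Website", False, "Investigating slow response times"),
    ]
    print(render_status_page(services, datetime.now(timezone.utc)))
```

In a setup like the one described above, a scheduled job could run such a script every minute and write the output to a separate web server, so that users only ever see the generated page and never the monitoring platform itself.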

