So who likes to travel? If we were to play Family Feud and had to name something that people like to do, I’m certain “travel” would be one of the top answers. But how many of us like waiting to board a plane? How many of us like delays and spending unnecessary hours at the airport? Right. I didn’t think so. So when I read in the news that a computer outage at Toronto Pearson International Airport had led to significant delays, I could imagine how frustrating it must have been. Being a Torontonian, I know our airport does not hold the title of busiest airport in the world. Nonetheless, it was ranked (albeit 38th) in 2011. Part of the news article said “technicians are not sure what caused the problem”, which is a scary thought. Unfortunately, this is not an isolated incident that only happens at Nav Canada, the company behind traffic control at the airport.
Root cause analysis is one of the holy grails of IT management. If you are a system administrator who can pinpoint exactly why an outage happened, not only will you look like a superstar, your users and customers will love you for it. How can you achieve that with up.time? First of all, you need a unified dashboard so you can see things as they happen. Just as important is getting alerts to the right person at the right time. But once you get the alert, what’s next? You need to be able to monitor complex business services.
There are two key points to consider:
- First, you must have coverage for all the underlying components that make up your business services. Whether it is the OS, applications, or the network and its devices, you must have visibility into everything in your infrastructure.
- Second, you need to be able to tie all the different components into your business services so that you can see the overall health of your services and exactly which components are down.
The latter is vital if you want to perform root cause analysis. Having a tool (like up.time) that facilitates root cause analysis will make you the superstar (that you are), save you time troubleshooting issues in your environment, and help you get to the root cause of any outage with ease! If you haven’t tried out up.time in your environment, you need to download it and take it for a spin!
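To make the two key points above a little more concrete, here is a minimal Python sketch of the idea of tying components into a business service. It is not up.time's API or data model; the `Component` class, the `is_healthy` and `root_causes` functions, and the example names are all hypothetical and exist only for illustration. The sketch assumes the dependency graph has no cycles.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A monitored element (hypothetical model): OS, application, device, or service."""
    name: str
    up: bool = True  # result of this component's own check
    depends_on: list["Component"] = field(default_factory=list)


def is_healthy(component):
    """Healthy means the component's own check passes and so do all its dependencies."""
    return component.up and all(is_healthy(d) for d in component.depends_on)


def root_causes(component, seen=None):
    """Return names of failed components whose own dependencies are all healthy.

    Those are the most likely root causes; anything failing above them
    may simply be a symptom. Assumes an acyclic dependency graph.
    """
    if seen is None:
        seen = set()
    if component.name in seen:  # already inspected via another path
        return []
    seen.add(component.name)

    causes = []
    for dep in component.depends_on:
        causes.extend(root_causes(dep, seen))
    if not component.up and all(is_healthy(d) for d in component.depends_on):
        causes.append(component.name)
    return causes


# Hypothetical example: a failed switch takes the database down with it.
switch = Component("core-switch", up=False)
database = Component("database", up=False, depends_on=[switch])
web = Component("web-server", depends_on=[database, switch])
service = Component("booking-service", depends_on=[web, database])

print(is_healthy(service))   # False
print(root_causes(service))  # ['core-switch']
```

The database also reports as down, but it is not listed because one of its own dependencies has failed. That separation of symptom from cause is what the second key point is asking for.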