The up.time IT Systems Management Blog

Archive for the ‘IT Management’ Category

CTOEdge Article: SLA Tips and Tricks

Tuesday, February 1st, 2011

Had a great little post in CTOEdge about SLA Tips & Tricks: http://www.ctoedge.com/content/sla-tips-and-tricks.

Alex

Why Freeware Just Doesn’t Cut it for the Mid-Enterprise

Monday, September 13th, 2010

Time for ‘the Ugly Truth’. Freeware just isn’t good enough sometimes.

Here are 5 of the main characteristics of freeware tools that make them unsuitable for the mid-enterprise:

  1. Hard to configure out of the box – You want a solution that is intuitive, with a clean interface, that doesn’t require massive amounts of scripting or customization to get started. You definitely wouldn’t want a solution that describes itself as “tricky to configure out of the box – even when you’ve got a good grasp of what’s going on”. Every interface in a good solution should guide you towards the best practices that will save you time and get your project rolling as quickly as possible.
  2. Extremely cumbersome to maintain and operate – You want a solution that’s well thought out, that doesn’t use conf files to keep lists of devices or massive lists of alerts. You want a system that uses rules, that minimizes the number of full-time staff hours needed to operate it, and most of all is easy to learn so that all of your staff can work with the monitoring solution.
  3. Requires massive customization to achieve results – You want a solution that allows you to monitor your infrastructure right out of the box, with a wide variety of available monitoring capabilities for various heterogeneous platforms, and that has the ability to monitor all of your common infrastructure stack elements. You definitely don’t want to learn a whole bunch of scripting to do something basic like web service, FTP, or database monitoring.
  4. No commercial support – You don’t want to be sifting through knowledgebases, mailing lists, and forums every time you find something that doesn’t seem to make sense when you use a product. You want to be able to pick up the phone or email someone and have experts on the product guide you to a suitable resolution. You need this because monitoring is an essential service; the last thing you need is to “wait for someone else who might have had this problem” to reply to your post on a public forum. You definitely don’t want the whole development and support organization to be “one guy” who “can’t respond to emails directly”. That’s just a supportability nightmare for your selected solution.
  5. No scalable architecture – As you continue to grow, all the problems above amplify themselves. More importantly, your infrastructure will grow across disparate geographic locations, and freeware tools just don’t have the kind of distributed architecture that up.time’s Multi Data Center (MDC) functionality provides to cope with the needs of multi-site reporting and collection. You need to be able to scale across multiple sites intelligently and efficiently, and manage everything from a single unified console.

The result of the above 5 points is that organizations typically experiment with Freeware tools initially, until they realize that the TCO (Total Cost of Ownership) due to man hours and massive maintenance required to keep their systems going just doesn’t make any sense. This is when the “aha moment” happens and people decide it’s time to graduate to a more robust tool.

Heck, don’t just take it from me, read this stuff directly from the “getting started guide for beginners” from one of the websites of a freeware tool (emphasis added by me). You’ll instantly see all the warning signs that this may not be what you wanted to sign up for.

Here are some important things to keep in mind for first-time Nag*** users:

  1. Relax – it’s going to take some time. Don’t expect to be able to get things working exactly the way you want them right off the bat. It’s not that easy.
  2. Use the quickstart instructions. The quickstart installation guide is designed to get most new users up and running with a basic Nag*** setup fairly quickly. Within 20 minutes you can have Nag*** installed and monitoring your local system. Once that’s complete, you can move on to learning how to configure Nag*** to do more.
  3. Read the documentation. Nag*** can be tricky to configure when you’ve got a good grasp of what’s going on, and nearly impossible if you don’t. Make sure you read the documentation (particularly the sections on “Configuring Nag***” and “The Basics”). Save the advanced topics for when you’ve got a good understanding of the basics.
  4. Seek the help of others. If you’ve read the documentation, reviewed the sample config files, and are still having problems, send an email message describing your problems to the nag***-users mailing list. Due to the amount of work that I have to do for this project, I am unable to answer most of the questions that get sent directly to me, so your best source of help is going to be the mailing list. If you’ve done some background reading and you provide a good problem description, odds are that someone will give you some pointers on getting things working properly.

So you could download a freeware tool and “Relax” because it’s going to “take some time”, or you can download up.time and relax because it’s going to be easier than you thought. The choice is yours.

Don’t worry, it’s not like a “choose your own adventure”. Whichever way you decide, in the end up.time will always be the right choice.

Living in the Clouds, The Myth, The Reality

Monday, July 12th, 2010

So, the question of the day – is “the cloud” as an infrastructure alternative becoming more of a reality, or is it still just a myth?

A Quick Status Check:

  • We are seeing vendors continue to consolidate their efforts to standardize cloud service offerings and provide new “cloud computing frameworks” (Terremark, Savvis, Liquid Computing, and VMforce, to name a very small handful).
  • We are seeing a cloud services and consultancy ecosystem cropping up (ServiceMesh and Symplified to solve cloud identity management problems, and CloudSwitch to solve cloud migrations and vendor management, to name a tiny sampling).
  • It’s becoming clearer and clearer that virtualization is a major building block for “cloudifying” our operations; it’s just not clear what level of virtualization we should be able to achieve in the data center. Nor is it clear how we should deal with all the processes required to reach these seemingly universally desired higher levels of virtualization to facilitate data with “private clouds”. (See Andi Mann’s interesting article on ‘VM Stall’.)

So back to the original question – Myth or Reality?

My thought is – still a bit of both.

The idea that you ‘should’ achieve 80%-90% virtualization in the private data center, or that you can deliver anything close to 100% of your IT operations using cloud based services alone continues to be more of a myth than a reality.

Most clients I work with continue to juggle their needs with respect to computing demand, data security, regulatory requirements and continuous systems manageability. All of this is being weighed across a diverse stack of private, MSP, and cloud service offerings.

Clients observe that every vendor is coming out of “the woodwork” to magically solve all of their “cloud” computing problems, and they realize, like you, that they need to figure out how to combine several technologies and platforms to create something unified that’s as unique as their business and technology needs. Typically this leads to a giant systems management architecture diagram that looks more like a patchwork quilt of disparate tools than anything remotely manageable or sane. This is typically when I get involved to help our clients start rationalizing their tool set with our product capabilities – either by leveraging those capabilities to aggregate data from disparate data sources or by enabling the complete removal of tools from their stack to simplify the overall architecture.

From this we quickly see the reality come into focus – there continues to be a need for systems management tooling that encompasses your needs for presentation, correlation, consolidation, and detection across all physical, virtual and cloud-based infrastructure. And we need this to be easy to license, deploy, manage and use.

If you are interested in seeing how you might potentially create this reality for yourself, feel free to join me on my next webinar that covers some of these topics “Simplifying Virtual, Physical and Cloud Monitoring”.


The “Real” Secret to IT Success

Monday, April 19th, 2010

So I swung over to Infoworld this morning over my cup of coffee and saw a headline that said “The real secret to IT success”. Wow! I thought – today I’m going to learn about fairies and unicorns! I couldn’t wait to learn the secret, so I excitedly clicked on the link….

Unfortunately at that point reality hit – Eric Knorr of Infoworld does an interview of Bob Lewis… of – you guessed it – Infoworld. Maybe Unicorns don’t exist after all.

Ok, so I’ll say it: I am really not a fan of Bob Lewis. Not to be harsh on the guy, but he seems to really be promoting this whole “concept” that IT shouldn’t be thought of as a service. On the surface this seems like it could be plausible, but dig a little deeper on the topic and you might start to feel like those theories are just picking at semantics. Want a sample of this stuff? Head over and read his article on the whole “concept” here (Run IT as a business — why that’s a train wreck waiting to happen).

So, is this blog post going to be totally about the merit of Bob Lewis’s theories? No… because in Bob’s article about “The Real Secret to IT Success” there’s pretty much only one idea I really agree with – and that’s his assertion that regardless of any process, the number one barrier to success in IT is a lack of communication and a lack of trust across silos.

Bob actually says it best himself:

“I know this is going to sound like I’m channeling Dr. Phil, but it’s still the right answer: In spite of all the panaceas out there — ITIL, COBIT, CMMI, and so on — relationships and trust come first. Without positive relationships and trust among participants, no process can work, all governance will be ineffective, and even the best employees will be hamstrung — tied up in conflict, bureaucracy, and rework”

Not sure why Bob thought he was channeling Dr. Phil… but he was channeling a whole lot of common sense.

This is where I totally diverge from Bob Lewis: I believe the conversation and trust come from “visibility”. Having a toolset that allows your teams to quickly find the problem across the entire stack, and stop the internal finger pointing, is the “key” to trust.

Just like a hockey team (sorry, I’m Canadian), if your players on the ice are the best in their positions but can’t trust the other players to deliver, they won’t pass the puck.

In IT we do the same thing. When a call comes in, we need to be able to pinpoint the area where the trouble is occurring. Let’s avoid the 30 minutes of all of our teams saying “Talk to network”, “talk to database”, “talk to firewall”, “talk to virtualization team”. Even worse is having to tell the person who called to report a problem with your infrastructure that you “don’t know what the problem is”, and that you have to call 5 different people in 5 sub-departments to find out what’s going on.

Let your management solution tell you via alerts which team needs to be mobilized, with the data that the team needs to get the problem fixed ASAP.

On a longer term basis, you need reporting that accurately depicts the availability of the system, and clearly denotes which team is contributing the most to recorded outages against the app-stack. The point of this is not to promote a culture of “blame and shame”, it’s about having hard data to back up a real productive internal discussion about how to move forward. When the data is real, and everyone can agree on the source, it’s amazing how trust and pro-activity kick in across silos.
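To make that kind of report concrete, here is a minimal Python sketch. It is not how up.time builds its reports; the outage log, the team names, and the 30-day window are all made-up assumptions for illustration, and it assumes outages never overlap.

```python
from collections import defaultdict

# Hypothetical outage log for one application stack over a 30-day window.
# Each record is (team, minutes_down); the teams and numbers are made up.
OUTAGES = [
    ("network", 30),
    ("database", 40),
    ("network", 15),
    ("virtualization", 10),
]

WINDOW_MINUTES = 30 * 24 * 60  # 43,200 minutes in the reporting window


def availability_pct(outages, window_minutes):
    """Percentage of the window in which the stack was up.

    Assumes outages do not overlap, so their durations can simply be summed.
    """
    down = sum(minutes for _, minutes in outages)
    return 100.0 * (window_minutes - down) / window_minutes


def downtime_by_team(outages):
    """Total outage minutes per team, largest contributor first."""
    totals = defaultdict(int)
    for team, minutes in outages:
        totals[team] += minutes
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    print(f"Availability: {availability_pct(OUTAGES, WINDOW_MINUTES):.3f}%")
    for team, minutes in downtime_by_team(OUTAGES):
        print(f"{team:15s} {minutes:4d} min")
```

The point of keeping it this simple is that everyone can agree on the arithmetic: one shared log, one availability number, and one ranked list of where the downtime came from.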

Make sure you have a systems management solution that eliminates the hand-waving, and make sure your teams can deliver systems that satisfy the business – without the finger pointing. Your management solution should be a catalyst for breaking through cross-silo barriers and ensuring that no one team (database, firewall, network, or virtualization) becomes the new victim of the scapegoat syndrome we are all used to in this industry.

With up.time, always be sure your star players are ready to “pass the puck”.

Just in case you’re one of the rare few who actually want to RTFA – the original “Secret of IT Success” article at Infoworld is here.

Find Your Inner Fighter Pilot

Monday, April 12th, 2010

In systems management, we can learn a lot from the mentality of a fighter pilot. What, you say – Ken’s been smoking the good stuff over the sunny Toronto weekend? What could a fighter pilot possibly have in common with someone in IT systems management?

A lot more than you think.

Think about it: what is a datacenter? A highly tuned combination of hardware and software designed to deliver services to the business. What is a Jet Fighter? A complex combination of millions of hardware components with a highly tuned set of software components designed to defend the pilot and provide the services necessary for him to project his will at command. Wow, not so different?

So where have we gone wrong? What can we learn from the Jet Fighter Pilot? The difference is in approach. Just like the pilot and his cockpit, we have huge arrays of data available to us through gauges, niche software, profiling tools, scripts… you name it, we have it. Guess what? When the pilot is in the heat of an engagement, he’s assessing his threats; he’s not sitting there fixating on a particular gauge. We need to stop fixating on niche tools, profilers and other specific metrics. We need something similar to a pilot’s heads-up display that will allow us to assess the biggest threats to the IT organization.

Worse, you’ve bought a tool that claims to do this, but rather than having a nice seamless HUD or “single pane of glass”, you have a “stained-glass window” comprised of dozens of individual applications poorly duct-taped together.

Good thing up.time has a very specialized set of reporting capabilities to allow you to figure out where your major IT problem hotspots are, which infrastructure is suffering infrequent downtimes, and where constant “5 minute problems” are sapping your team’s productivity.

All of the above issues ARE the major threat to IT. Those are the things that make people wonder “Why aren’t we outsourcing this service? It NEVER works!” This is the equivalent of having your jet fighter shot down.

Join me for one of our upcoming webinars and find out how to unleash your inner fighter pilot.

Meeting customers and scaling

Monday, March 22nd, 2010

In case any of you who read Solution Architect Ken Cheung’s (aka Knailz) post last week didn’t read between the lines, he’s single and available. So, if there’s anybody out there who’s into quiet confidence, motorcycles, and Nexus Ones, he’s your man.

Over the past few weeks, I’ve been able to spend some time talking with some of our banking customers and really getting a kick out of how they use up.time in their environments. Each installation covers at least 1,000 servers and spans many different kinds of hardware platforms. One of our trading floor customers uses up.time for resource planning, and over seventy line-of-business users regularly run reports for hundreds of systems spanning 1-3 months of time. As you can well imagine, this kind of reporting can severely impact a data collection system – it’s the typical OLTP vs. data-warehouse workload tuning issue.

Fortunately, the way up.time is designed allows for massive scaling, and the various major components can be broken apart and scaled accordingly — they can also be run on different types of hardware platforms to suit particular purposes.  So, in the case of this customer, the data collector (or monitoring station) runs on a Solaris platform, the reporting engine runs on a Windows platform (raw CPU power for PDF generation), and they run UI instances on Linux platforms (we do this when large user communities use up.time).

The other thing I really like about these banking customers is ‘transparency’ – the reason their large user communities, specifically the line-of-business users, use up.time is so that any business user can understand how their applications are working and how the various underlying components are functioning (well, they only care when they’re not functioning properly). It’s refreshing to see business users and IT operations share baseline data to improve application performance.

Alex

Fall in Love for All the Right Reasons

Monday, March 15th, 2010

That’s right, this morning I decided to bite the slogan of E-Harmony. What does E-Harmony have to do with systems management?

Ever notice that picking a software vendor to do a major software rollout is much like trying to find a soul mate? You have all the same processes, all the same risks, and all the same possibilities for financial ruin. (Like my positive view on dating? Guess why I’m still single – ‘technically’).

The Lure – Just like meeting someone, the initial attraction is what meets the eye. It’s all about the flashy selling points, the nice collateral and the teaser features that stick in your mind.

The Dance – The vendor demos, the RFIs, the bake-offs, the sweet talking. You decide to check out some of her friends to make sure they can vouch for her; referrals are always reassuring.

The Pot Commit – You’ve become the internal champion for the product, no matter what’s wrong with things, you’re going to make a determined internal sell. Despite all the recommendations and feedback from your friends and family – your heart is set, you’ve found the apple of your eye.

The Fireworks – After a tumultuous courtship you finally get to the good part, and you sign on the dotted line. You “Shake Hands”.

The Honeymoon – You’ve not just bought the $1 of software, you’ve also made sure to get the $10 of training, so the implementation kickoff is happening and everyone is in love with your vendor. A few great kickoff meetings and everyone is excited about the vision of the future. Everything smells like a bed of roses.

The Reality Check – 12-24 months later, no more bed of roses. Your entire team wonders where the dream of the single pane of glass went, and all those sexy features you once were so excited about bore you and don’t meet your needs. You have that gut-wrenching feeling that maybe this wasn’t what you had signed up for.

You’re wondering where all your dreams of server monitoring have gone, how much of yourself you’ve lost in the process….

Luckily you’re strong, you’ve got a good team, you pick yourself up, assess the damage, and get ready for the next adventure. Guess it’s time to log back into E-Harmony and find another vendor?

Don’t get caught up in the cycle above. Ditch E-Harmony vendor dating, download up.time, install it in 7 minutes, and see what a real-life, healthy, and comfortable server monitoring initiative really feels like.

It’s OK if you’re not convinced. Go play the field first, and when you’re done, come meet your match. Your heart (ahem, career) will thank you.

CA buys Nimsoft, not such a good deal for everybody

Thursday, March 11th, 2010

Wow, big news yesterday!  CA acquired Nimsoft for $350MM to access the MSP/cloud markets and to blend them into their portfolio of cloud offerings.  This acquisition is interesting for many reasons, most notably for CA’s stance that they will use Nimsoft’s offerings to access the mid-market (or emerging enterprises in CA parlance).

Mid-enterprises (under $2 Billion in revenue) are struggling with how to monitor their physical, virtual and cloud resources with minimal budget and staff. Nimsoft has aggressively marketed itself to large enterprises as a ‘Big 4’ replacement and its price and complexity reflect this.  CA’s acquisition of Nimsoft pushes them further out of reach for mid-enterprises and we expect them to be even less competitive in this market moving forward.

If you are a mid-enterprise company, here are a number of reasons why this is not a good deal for you:

1. Increased Risk: There is risk in investing in a product that doesn’t have a clear future. We question if this product will even remotely resemble itself in 12 months.

2. It’s now a CA Product: Companies are no longer buying Nimsoft, they are buying a CA product. Anyone looking to move away from a Big 4 framework because of cost, complexity or support reasons should take care.

3. Focused on MSP: CA’s plans moving forward involve using Nimsoft for their MSP offering. If companies are looking at Nimsoft as an all-in-one solution, it looks like Nimsoft may be forced to focus on MSP by CA. If you aren’t an MSP, there might be reason for concern.

4. Nimsoft isn’t a Mid-enterprise Product: They focus on ‘Big 4’ enterprise replacement; just visit their website to see their market positioning on that. Accordingly, their complexity and cost are closer to ‘Big 4’ than mid-enterprise. As a part of CA, Nimsoft will move even further out of reach of the mid-market.

5. CA doesn’t Sell to the Mid-enterprise Market: Mid-enterprise solutions need to be complete, easy-to-use, value priced and have great support. You should be skeptical that CA can provide that, given their track record.

All in all, the systems management space continues to be interesting (and shrinking).

Alex

Devotion to Duty

Monday, February 22nd, 2010

Today’s xkcd comic was one that I got a real kick out of. Picture John McClane as a sysadmin, and you get the picture. The unstoppable reluctant hero, the right guy in the right place at the wrong time. The relentless pursuit of availability and performance for the apps they support no matter the effort, that common thread amongst all great sysadmins worth their salt. But at what cost to the admin and those around them does this come? Well, if they have subpar systems management software, at great cost. A good toolkit of monitoring/management software and a few point tools for some vendor-specific use cases will allow our protagonist to go from being the burnt-out, run-down admin to becoming the Dicky Fox of IT and jump each morning head first into whatever the world (or the Datacenter) can throw at them. Systems Management software is to the sysadmin what spinach is to Popeye. It’s going to give them what they need when the going gets tough. With detailed drill-down data and analytics, traversing from physical to virtual environments and back becomes something that is done with ease.

I’m a big fan of tools; my workshop has far more than my wife thinks any sane person should require. There is a saying, “The right tool for the job”. You wouldn’t try and screw in a Phillips head screw with a Robertson driver (The Robertson, BTW, is possibly the best screw head ever. And a nice little Canadian invention. Licensing issues kept the world from reaping the benefits of this beauty). When picking the right tool for the job, you are balancing a few things, cost and capabilities being key. You can buy a $30 screwdriver that only screws in one type of screw, or you can buy a set of screwdrivers for $30 and do all sorts of different screwing. I’ll tell you though that the $30 single driver will probably never strip and will be able to drive screws until you lose it. On the other hand, the $10 driver will probably do the trick as well, and provide you with a quality driver. Where am I going with this? The systems management space has all kinds of offerings that you can put into your toolbox. There are expensive tools that do one thing and do it flawlessly. There are cheap tools that can do a mountain of things, but they don’t excel at any one thing and you’ll end up outgrowing them as you become more proficient with your tools. Then there are the sweet spot tools, the Ridgids of the software world. These tools do exactly what you require, they do it well, and you would be hard pressed to outgrow them. This is where I feel that up.time fits into the systems management software space. We’re not the cheap tool, but we’re not the overly expensive Tivoli or HPOV framework either. We fit into that sweet spot where you are going to get pretty well everything you could ask for and be happy with what it cost you.

So do your sysadmins a favour and thank them by letting them trial up.time. It will make their life easier and make you, the IT manager, look like a hero as well, with increased productivity and cost savings. Even if you don’t go with a solution from us, when your sysadmins ask for tools, open your IT wallets for them at least a little. Some IT spinach will go a long way to keeping the strength in the arms of your Datacenter Popeyes!

Why Real Winners Keep IT in Focus – A Lesson from Google.

Friday, February 12th, 2010

Ok. You’ve got me – I’m a Google fan. I believe in Google, I think they are better than sliced bread. I even imported a grey-market Google phone the day they released it and switched my carrier to make it happen. Wow, this fanboy is declaring his love for Google on his corporate blog post. End of story, right?

Unfortunately not. Over the past few weeks Google has gone on a flurry of announcements, product launches, and new corporate adventures. Let’s do a review:

1) Launching the Google Phone and attacking the consumer phone market with a new (for North America) direct sales model

2) Launching Google Buzz and attacking the social media market

3) Launching the new 1 Gigabit broadband internet project

So what’s my beef? I strongly believe that companies you can depend on and come to rely on in your daily operations and life need to keep their core focus. That is, good companies continue to concentrate on doing what they are good at. Losing your core focus to attack 3 of the most tumultuous markets – the Consumer Handset Market, the Social Media Sphere, and the Telecom Industry – seems like excessive risk. Why didn’t Google decide to focus on developing into these monstrous, barb-ridden markets one at a time? Is their ambition bigger than their capability? Or are they naive with their newfound power and have already decided that they can take on any market?

So what’s the lesson? How does this relate to systems management?

Partnering with a solutions vendor for IT systems management that doesn’t have a core focus on the market is a mistake that I’ve seen played out time and time again. Practitioners get sucked into platforms designed by vendors who either have a vested interest in their own software stack, have a focus on markets that have nothing to do with the end user (MSP for instance), or are busy building functionality and features that just aren’t really related to managing the infrastructure but are more focused on vanity metrics.

A lack of focus also results in a fragmentation of the solution, as we see in many of the big 4 frameworks. Fragmentation is the careless tacking together and bolting together of 3rd party systems or intellectual property from acquisitions. This makes for very nice marketecture diagrams and collateral – but produces solutions that have clients scratching their heads 12 to 24 months after roll-out, thinking “where did all my money go?”

Maybe Google is capable of attacking 3 massive markets at once and of dividing their focus across multiple fronts, but IT systems management and monitoring vendors cannot.

When choosing a vendor, make sure you partner with a young, agile company that lives, breathes and does nothing but create a management solution that works. End of story.

Thanks Google, and good luck.