The Big Picture in IT Systems Management

Friday, June 27, 2008

Can you create corporate culture?

The original intent of this blog entry was to praise the wonderful and talented staff that we have here at uptime software, at least until I mentioned it to a staff member. I was promptly told that the smart ones would think I was simply puffing them up and would quickly see through the ruse. It's been a little weird to stand up at a company town hall and say we're great ("uptime rocks!") as a motivator and then have all the staff look at you and declare, "Yeah, we know, we're here every day." This brings up company culture and the difference between saying it and living it (show, don't tell). Whenever new hires come on board, within the first few days there is invariably a comment like "wow, you guys are really happy here." This inclusive culture extends to our telecommuters as well.

Something else that's been really good for us is that referrals from the tech community started coming in a few years ago. We're pulling great talent from other local software businesses. People like it here because we are a performance-oriented organization, and smart work (not necessarily long hours) gets recognized. Opinions and dissent are encouraged, as everybody has a very clear picture of what kind of business we want to be. At our operations meetings, if everything seems to be going okay and everybody agrees, then we know that something is wrong or we're not challenging ourselves enough. A while ago, our development manager was telling one of his staff about voicing opposition to "stupid ideas," and how it took him a long time to realize that telling his boss (me, at the time) that an idea was stupid was perfectly acceptable.

So, we're hiring, and our open-requisition list is getting longer. Come work for a great company that's in the hippest area of downtown Toronto (King West). We have tonnes of restaurants (Thai, Indian, Lebanese, Korean, Japanese, Fusion, French, ...) within three blocks of the office. We also have great bars, patios, galleries, baseball, and soccer nearby.

Contact hr-careers@uptimesoftware.com!

Okay, I lied, this blog entry is to praise our staff... To our fantastic employees, you guys rock.

Thursday, June 26, 2008

Lament for splunk

Splunk is a partner of ours, and up.time integrates with Splunk to assist with forensic problem resolution. Michael Baum, their CEO, just published a blog entry titled "Ode to Log Management."

I hope it's not the beginning of the end.



Interesting customers

On a regular basis, I try to talk to customers about up.time and what their likes and dislikes are. I recently had the good fortune to talk to Ben Rockwood, of Cuddletech blogging fame. Ben is about as hard-core Solaris as they come, and if you read his blog you'll pick up lots of interesting snippets about everything Solaris-related. As it turns out, Ben is also working at Joyent, which is a customer of ours and a large Sun user. Specifically, they use a lot of Sun hardware with Solaris containers, something that plays well into our hetero-virtualization strength.

We had an interesting conversation that took some unexpected zigs and zags, ranging from product features, to agent vs. agentless monitoring, to how desperately he tried to get rid of up.time when he first started at Joyent.

First, how about the getting-rid-of-us story. When Ben initially joined Joyent, he couldn't believe that they had bought a commercial tool for performance and availability management. Why weren't they using open source? As it turns out, every time he needed forensic data to understand why outages had occurred or why performance was suffering, he discovered that up.time had already recorded all the necessary low-level performance stats. There was no need to go back and script the necessary commands; the data was already in up.time's performance data warehouse. He finally conceded that up.time knew what you wanted even before you knew you needed it. So, we got to stay!

We had further discussions about other systems management vendors and the techniques they use for gathering data. We agreed that to get the low-level metrics needed for both planning and forensic problem diagnosis, you need agents on the monitored systems. He is the first guy I've talked to who also agrees that Net-SNMP is not an agentless solution!

Ben had a legitimate critique of our service monitor extensibility, as our XML document definition for defining the data is a little cumbersome (and yes, we're working on it). However, he really liked the fact that our extensibility supports the concept of Arrays of data (or ranged data, as we call it internally). Almost all tools (open source and commercial) are extensible, but only for atomic data (e.g. one integer, one string, etc.). With up.time, you can define a service monitor that understands an Array of data. So, for example, let's say that you want a count of users per Solaris zone (and want to record this over time). You can do this in an Array, e.g.:

Zone NumUsers
zone1 2
zone2 5
zone3 6

The up.time monitor will take in this data and then you can graph the data on a line chart with the actual zone names and meaningful titles.
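To make the atomic-versus-Array distinction concrete, here is a minimal Python sketch of what handling ranged data looks like in general. It is not up.time's monitor definition format or API; the names `parse_ranged_output` and `record_sample` and the integer timestamps are made up for illustration, and the only assumption about the input is the two-column layout shown above.

```python
from collections import defaultdict

# Sample output in the two-column layout shown above (header + one row per zone).
SAMPLE_OUTPUT = """\
Zone NumUsers
zone1 2
zone2 5
zone3 6
"""


def parse_ranged_output(text):
    """Parse 'name value' rows that follow a header line into {name: int}.

    Hypothetical helper: an atomic monitor would return a single value,
    whereas this returns one value per named row.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if not rows or len(rows[0]) != 2:
        raise ValueError("expected a two-column header line")
    sample = {}
    for row in rows[1:]:
        if len(row) != 2:
            raise ValueError(f"expected 'name value', got: {row!r}")
        name, value = row
        sample[name] = int(value)
    return sample


def record_sample(history, timestamp, sample):
    """Append one (timestamp, value) point to each zone's series."""
    for name, value in sample.items():
        history[name].append((timestamp, value))


# One series per zone, keyed by the zone name, ready to be charted over time.
history = defaultdict(list)
record_sample(history, 0, parse_ranged_output(SAMPLE_OUTPUT))
```

Each polling interval adds one point per zone, so the series stay keyed by the real zone names, which is what makes a labelled line chart possible later.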

We also talked about support for dtrace, especially since it's now being supported on platforms other than Solaris. Our current extensibility is solid groundwork for supporting output from dtrace scripts. We're currently looking at supporting dtrace through our data gathering mechanisms, which will only add to our arsenal for quickly diagnosing performance problems.

In a parting comment, Ben mentioned that "uptime software gets it." They came from a systems administration/management background and have continued to make up.time usable for the very people who need to keep infrastructure running.

Now, the conversation wasn't a total love-fest. Ben did have some good critiques, and it's people like him who are helping set the bar higher for us.

It's great to have customers who like using your product but who are always pushing you to do better.

Wednesday, June 4, 2008

Latest release

I'm quite pleased to announce that we've released our latest version of up.time. This release includes a number of great new things: extensive VMware, pSeries micropartition (AIX), and Solaris virtualization capabilities; a whole new Service Level Agreement (SLA) solution; and some nice end-user transaction monitoring capabilities.

The product is more scalable than ever - we can now effectively monitor 5,000 systems in a single monitoring loop, and this will scale even further with our multi-data center release that is coming soon. We've also cleaned up and simplified the user interface to make it even easier to navigate through the product to get the necessary data with as few clicks as possible.

Check it out: http://www.uptimesoftware.com/overview.php