A former Microsoft executive referenced watermelon KPIs the other day on social media. It’s been a long time since I’ve heard that term, but I see examples of watermelon KPIs frequently. Let’s talk about what watermelon KPIs are, how to recognize them, and why people still love them even though they are bad.
A watermelon KPI is a statistic or metric that looks good, but upon further examination, is actually bad. Like a watermelon, these KPIs are green on the outside, but red on the inside.
Why watermelon KPIs are tempting
IT professionals love watermelon KPIs. They are a form of spin. Watermelon KPIs allow you to obscure problems, or make a bad problem look good.
Having worked in the security vendor and reseller space since 2016, I see watermelon KPIs all the time. It starts with an innocent enough request. A prospect comes to me and says they need custom metrics, because they have specific KPIs that they’ve found work for them. I show them the metrics I use and they have a myriad of reasons to reject them.
Inevitably, when I dig in, what I find is these custom KPIs aren’t working. They are obscuring a problem, but by making bad look mediocre or even good, the organization has some kind of an uneasy truce. They’ve convinced themselves that their real problem is that the data is stale, and if they could produce this data in an automated fashion, it would solve their problems.
But in my experience, outsourcing or automating the process to make these custom KPIs faster and cheaper never results in the expected improvements. That’s because it doesn’t address the root cause. Watermelon KPIs mask the root cause. You need to find the KPIs that highlight the root cause so you can address it.
A hypothetical example of a watermelon KPI: wealth mass
Let’s step away from IT for a minute. Let’s say I run a business. The business has multiple locations. And I have a specific, custom KPI that I use to identify locations that have hidden upside. My KPI is wealth mass. It’s the total annual income of a ZIP code. But don’t tell anyone; this is a trade secret.
What I have discovered is that the Missouri ZIP codes of 63124 and 63125 are 14 miles apart, and both are part of the St. Louis metropolitan area. And they have nearly identical wealth mass. But the rent in 63125 is significantly lower than it is in 63124. So 63125 is a hidden bargain. Exploiting that is going to make me rich, rich, rich.
The wealth mass is very similar. The difference is literally a rounding error. So if I’ve doubled down on this metric, I’ve convinced myself that a business that does well in one of those ZIP codes will do equally well in the other.
This KPI is more tempting for a business that has an existing location in 63124 and is looking to expand into 63125. It looks like a bargain opportunity. Expansion from 63125 into 63124 is a bit less attractive because the rent is higher. The overhead will be higher to expand into 63124. But depending on the business, the numbers may still work if the margins are high enough to absorb the higher rent.
What’s even better is that these two ZIP codes are 14 miles apart, but not directly connected by any major highways. Effectively they are 30 to 45 minutes apart, depending on traffic, so the two locations won’t compete with each other.
That’s green on the outside. Let’s dig into why this KPI is a watermelon.
Red on the inside
A business that tries to expand from one of these ZIP codes into the other is going to have a hard time. And you don’t have to dig very far to see why. The total income of these two ZIP codes is nearly identical. But here’s the catch. The population of 63124 is 25% that of 63125. Conversely, the average annual income in 63125 is 25% of the average in 63124.
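To see how a product of two factors can hide the factors themselves, here is a minimal Python sketch. The population and income figures are made-up round numbers chosen only to match the 25% ratios described above. They are not real census data for either ZIP code, and the function name wealth_mass is purely illustrative.

```python
def wealth_mass(population: int, avg_income: int) -> int:
    """Total annual income of an area: head count times average income."""
    return population * avg_income


# Hypothetical figures for illustration only, NOT real data for these ZIP codes.
areas = {
    "63124": {"population": 10_000, "avg_income": 200_000},
    "63125": {"population": 40_000, "avg_income": 50_000},
}

for zip_code, area in areas.items():
    total = wealth_mass(area["population"], area["avg_income"])
    print(
        f"{zip_code}: population={area['population']:,} "
        f"avg_income=${area['avg_income']:,} wealth_mass=${total:,}"
    )
```

Both rows report the same wealth mass even though the two markets have nothing in common. Printing the inputs next to the product is what exposes the red inside.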
If I’m an operator of a grocery store, I can operate a grocery store in either of those ZIP codes, but there are any number of products that are going to sell well in one location and poorly in the other. A $3 bottle of wine is off-putting in 63124, and a $100 bottle of wine will sit much longer in 63125 than it will in 63124. For a manager of an underperforming liquor department in that grocery store, wealth mass is a watermelon KPI.
A store that sells used appliances that does well in 63125 would struggle in 63124, although they could use this KPI to convince themselves that a turnaround is right around the corner. An estate planning service will have the opposite problem, doing very well in 63124, but not having as many clients in 63125 as this watermelon KPI of wealth mass might suggest.
Business operations would see right through this KPI and tell you that if you rely on this secret of the universe you discovered, you deserve what you get. But IT departments use KPIs like this all the time.
A real world example of a watermelon KPI in IT
I won’t say his name or who he works for, but I ran across an IT director who thought he was a high performer. The system he was in charge of had a 95% uptime. And he would brag about this to anyone who listened.
Industry standard for uptime is somewhere between 99.99% and 99.999%. If you hear the phrase six sigma, six sigma is approximately 99.997%. It’s a popular standard because it’s significantly cheaper to achieve than 99.999% while being very nearly as good. Think Bentley vs Rolls-Royce. 99.99% is a bigger step down than it sounds, but won’t raise significant questions about your success. Think of 99.99% as a Buick.
But 95% sounds good, on its surface. Get a 95% on everything in school, and you’re a candidate for valedictorian.
But 95% uptime means 5% downtime. It means that 1 hour out of 20, the system is down. So you’re looking at the system having 72 minutes of downtime every day, on average. If you can choose which 72 minutes, that might not be a huge problem. But it’s a very big problem if you don’t get to choose which 72 minutes. The system being down at 2:00 a.m. when most people are asleep is acceptable. But 72 minutes of downtime at 2:00 p.m., in the middle of business hours, costs a lot of money.
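The arithmetic is easy to check. Here is a minimal Python sketch that converts an uptime percentage into average downtime. It assumes a 24-hour day and a 365-day year, treats downtime as an average rather than a prediction of when outages happen, and uses a function name I made up for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def downtime_minutes_per_day(uptime_percent: float) -> float:
    """Average minutes of downtime per day implied by an uptime percentage."""
    return (100.0 - uptime_percent) / 100.0 * MINUTES_PER_DAY


for uptime in (95.0, 99.99, 99.999):
    per_day = downtime_minutes_per_day(uptime)
    per_year = per_day * 365  # assumes a 365-day year
    print(f"{uptime}% uptime: {per_day:.2f} min/day, {per_year:.1f} min/year")
```

Putting 95% in the same table as 99.99% and 99.999% makes the gap hard to spin: the first is measured in minutes per day, the others in minutes per year.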
Another watermelon KPI I hear frequently is an assertion that a team deploys 99% of their updates to 99% of their systems with a 99% success rate. They may even hit the deadline 99% of the time.
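Multiply the first three together and that works out to an aggregate success rate of about 97%, not the implied 99%. Factor in the deadline as well and it drops to about 96%. It may still be good, if it’s verifiable and turns out to be true.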
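The aggregate comes from multiplying the individual rates together. Here is a minimal Python sketch, assuming the rates are independent and that every one of them has to go right for an update to count as a success. The function name is illustrative.

```python
import math


def aggregate_rate(*rates: float) -> float:
    """Combined success rate when every independent step must succeed."""
    return math.prod(rates)


# 99% of updates, to 99% of systems, with a 99% success rate
three_factors = aggregate_rate(0.99, 0.99, 0.99)

# ...and hitting the deadline 99% of the time
four_factors = aggregate_rate(0.99, 0.99, 0.99, 0.99)

print(f"three factors: {three_factors:.2%}")
print(f"four factors:  {four_factors:.2%}")
```

Each additional 99% you stack on pulls the real number further from the 99% the listener walks away remembering.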
A former client of mine, who worked in military intelligence near the end of the Cold War era, told me that tactic sounded really familiar. He said that Soviet officers liked to use numbers like that. They were high enough to sound really good, but by their own admission weren’t perfect. If someone knows about a case where you missed, you can deflect. Oh yes, that’s the 1% where I missed.
If your actual success rates are in the 90s, your total success rate is not 99%, but the aggregate may still be acceptable. So that series of 99% may or may not be a watermelon if it’s true. But if you are actually falling short of 99%, and you’re just relying on the difficulty of proving the actual numbers, it’s very much a watermelon.
Why IT departments use watermelon KPIs
I don’t think it’s coincidental that both IT departments and Soviets use/used watermelon KPIs. The conditions they have in common have some frightening similarities.
We’re talking high-pressure, high-stakes situations. One mistake in either place can end your career. The mistake may not even be your fault. The major difference is whether that mistake also ends your life in addition to your career. Legally, the IT department in the United States can’t kill you.
Both of them also deal with austerity. I’m sure you’ve heard the story of how NASA spent millions of dollars to develop a pen that can work in zero gravity. The Soviets used a pencil. Politicians and right-wing preachers love to repeat that story. The difference was, NASA saw carbon dust flying around as a problem that would cause critical systems to malfunction on rare occasions, but with dire consequences. The Soviets accepted the risk.
IT departments operate with similar budgets and staffing levels to what they had in the year 2000, even though the size of their networks has grown significantly in those 20 years. If IT seems like a harder job than it was 20 years ago, you’re not imagining things. I figured out how to solve a problem in 2005 that an analyst at a certain very expensive consulting firm says is impossible to solve now. I found that statement strange, because I solved the problem. But when I studied the problem, I realized we’re probably both right. I could solve the problem under 2005 era conditions. But today, people are expected to do what I did at 20, 50, or even 100 times the scale that I did. And sometimes with worse technology than I had, because the worse technology is cheaper.
When you can’t afford enough people or the right technology to solve a problem, you resort to watermelon KPIs as a form of self-preservation. But that doesn’t address the root cause. That’s why you can have KPIs that suggest better times are right around the corner, yet you never quite reach the corner. It’s like Zeno’s paradox of motion, where each step brings you closer to your destination, but you never reach it.