The alerting tool ecosystem is so far behind it hurts – and I want to fix it.
"Why hasn't PagerDuty done this?"
The first few years of FireHydrant were a constant barrage of this question. Investors, friends, and prospective customers asked it in their own way, and I responded the same way each time: "Why haven't they done anything?"
My response, admittedly, is cheeky. But universally, it garnered the same reaction: "That's a fair point."
As the CEO of FireHydrant, an incident management tool, it's my job to have answers to these types of questions about other businesses in the same vicinity as ours. But that one has remained the most elusive to me. We've been building an incredible incident management tool for almost five years, and 95% of our customers have already integrated an alerting tool, which meant the question turned toward us:
"Why aren't you building alerting?"
And then I started replying with the same thing: "That's a fair point." So today, we're opening up the waitlist for a modern replacement for PagerDuty or any other alerting tool you can think of.
It's called Signals.
The problem with alerting tools today
I see four main problems with the alerting ecosystem as it stands today, and I've spent time with customers and industry experts validating them all. Here's what I've learned:
Alerting as a standalone tool is "crazy fucking expensive," and no one can tell you why. People have reached their boiling point, and current macroeconomic conditions are drawing even more scrutiny to the price of alerting tools. CFOs are not happy campers right now.
Scheduling and substitutions are painful, and they haven't materially changed for years. It took several years even to get simple round-robin logic on schedules. Moreover, you can't do obvious tasks inside Slack, such as arranging temporary coverage to walk the dog or pick up the kids.
The service directory is a sham, and teams mean nothing. Every business I spoke to inevitably griped about their configuration: "We just use the service directory to represent teams because there's no other way to page them otherwise." This is asinine, and it means service ownership isn't genuinely possible without ugly hacks.
Alerts and incidents are treated as the same thing, so people get screwed by noisy monitors. It also prevents teams from getting valuable insights such as alert noise ratio and accurate mean-time-to-detect metrics.
This list is far from complete, and several issues with current providers are unnamed for brevity. I know I have many of my own as a recovering on-call engineer, too.
But the feedback was too consistent to ignore, and we decided to do something about it this year.
Alerting and incident response belong together.
First, alerting is a must-have for most businesses building software in 2023. But what isn't a must-have is paying for hundreds (or thousands) of seats that sometimes go unused for several months each year.
So we're resetting the pricing standard: active user invoicing. When you use Signals for alerting, you'll pay only for the users who were notified in a given month. Signals is also an add-on to our core incident management platform, meaning you get all of the power of modern incident management, alerting, status pages, and retrospectives for less than the average per-seat price on PagerDuty. That comparison even assumes you're notifying 90%+ of your entire on-call rotation, which is a high percentage for most businesses.
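To make the idea concrete, here's a rough sketch of how active user invoicing could be computed. Everything in it is an assumption for illustration: the notification log, the function names, and especially the price, which is a made-up number and not a FireHydrant price.

```python
from datetime import datetime

# Hypothetical notification log: (user, time the user was notified).
notifications = [
    ("ana", datetime(2023, 9, 2, 3, 14)),
    ("ana", datetime(2023, 9, 18, 22, 5)),
    ("raj", datetime(2023, 9, 18, 22, 6)),
    ("lee", datetime(2023, 10, 1, 9, 30)),
]


def billable_users(log, year, month):
    """Return the distinct users who were notified in the given month."""
    return {user for user, ts in log if ts.year == year and ts.month == month}


def monthly_charge(log, year, month, price_per_active_user):
    """Charge only for users who were actually notified that month."""
    return len(billable_users(log, year, month)) * price_per_active_user


PRICE = 20  # made-up number for illustration, not a real price

print(monthly_charge(notifications, 2023, 9, PRICE))
```

In this sketch, a user who is paged twice in a month still counts once, and a user who is never paged that month costs nothing.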
Next up: Configuration. FireHydrant Signals leverages our existing service catalog to scope escalation policies, signal rules, and schedules to a team. You need to notify teams, not services, about incidents. It also means you can quickly ask at any given point, "Who is on call for Team X?" and page them immediately. Additionally, since services have ownership in FireHydrant already, you get the same backward compatibility with whichever tool you use. And, yes, our Terraform provider will be updated on day one.
Finally, in Signals, there's a clear separation between an incoming Signal, an outgoing alert notification, and a declared incident. The benefit? Clear-as-day analytics to determine which teams receive the most alerts and which ones have the best signal-to-noise ratio. This allows teams to have data-backed discussions about which alerts they need and which they can drop entirely. The result is more sleep, happier on-call teams, and faster assembly time.
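Here's a minimal sketch of the kind of per-team analytics that separation makes possible. The Alert record, its fields, and the definition of "signal ratio" (the share of alerts that led to a declared incident) are all assumptions for illustration, not the actual Signals data model or metric.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """Hypothetical record of one outgoing alert notification."""
    team: str
    became_incident: bool  # True if a declared incident followed this alert


def alert_stats(alerts):
    """Per team: alerts received and the share that became incidents."""
    total = Counter(a.team for a in alerts)
    actionable = Counter(a.team for a in alerts if a.became_incident)
    return {
        team: {"alerts": n, "signal_ratio": actionable[team] / n}
        for team, n in total.items()
    }


alerts = [
    Alert("payments", True),
    Alert("payments", False),
    Alert("payments", False),
    Alert("payments", False),
    Alert("search", True),
    Alert("search", True),
]

stats = alert_stats(alerts)
for team, row in stats.items():
    print(team, row["alerts"], row["signal_ratio"])
```

With numbers like these, a team can see that most of its pages never turn into incidents and start deciding which monitors to tune or drop.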
Signals is also built from the ground up by FireHydrant. We're not acquiring or merging with a company to be able to claim that we now do alerting overnight. The reality is that no company has solved it the way we think it should be solved, which necessitated a greenfield build. Spoiler: It's badass.
Signals is being released this winter to everyone. Still, if you want it earlier, head to www.firehydrant.com/signals to sign up for our waitlist and get access to the beta when it launches. Over the next several months, we'll also release technical details about how Signals is built, the considerations we've made, and why it's kickass software solving a real problem.
I'm excited to show everyone what we've built soon.