The Multi-Colored Brick Road to the Cloud with Rachel Dines
Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.
Corey: The company 0x4447 builds products to increase standardization and security in AWS organizations. They do this with automated pipelines that use well-structured projects to create secure, easy-to-maintain and fail-tolerant solutions, one of which is their VPN product built on top of the popular OpenVPN project which has no license restrictions; you are only limited by the network card in the instance. To learn more visit: snark.cloud/deployandgo
Corey: Couchbase Capella Database-as-a-Service is flexible, full-featured and fully managed with built in access via key-value, SQL, and full-text search. Flexible JSON documents aligned to your applications and workloads. Build faster with blazing fast in-memory performance and automated replication and scaling while reducing cost. Capella has the best price performance of any fully managed document database. Visit couchbase.com/screaminginthecloud to try Capella today for free and be up and running in three minutes with no credit card required. Couchbase Capella: make your data sing.
Corey: Welcome to Screaming in the Cloud. I’m Corey Quinn. A repeat guest joins me today, and instead of talking about where she works, we’re going to talk about how she got there. Rachel Dines is the Head of Product and Technical Marketing at Chronosphere. Rachel, thank you for joining me.
Rachel: Thanks, Corey. It’s great to be here again.
Corey: So, back in the early days of me getting started with, well, I guess, all this nonsense, I was an independent consultant working in the world of cloud cost management and you were over at CloudHealth, which was effectively the 800-pound gorilla in that space. I’ve gotten louder, and of course, that means noisier as well. You wound up going through the acquisition by VMware at CloudHealth, and now you’re over at Chronosphere. We’re going to get to all of that, but I’d rather start at the beginning, which, you know, when you’re telling stories seems like a reasonable place to start. Your first job out of school, to my understanding, was as an analyst at Forrester; is that correct?
Rachel: It was, yeah. Actually, I started as a research associate at Forrester and eventually became an analyst. But yes, it was Forrester. And when I was leaving school—you know, I studied art history and computer science, which is a great combination, makes a ton of sense—I can explain it another time—and I really wanted to go work at the equivalent of FAANG back then, which was just Google. I really wanted to go work at Google.
And I did the whole song-and-dance interview there and did not get the job. Best thing that’s ever happened to me because the next day a Forrester recruiter called. I didn’t know what Forrester was—once again, I was right out of college—I said, “This sounds kind of interesting. I’ll check it out.” Seven years later, I was a principal analyst covering, you know, cloud-to-cloud resiliency and backup to the cloud and cloud storage. And that was an amazing start to my career; really, I credit a lot of the things I’ve learned and done since then to that start at Forrester.
Corey: Well, I’ll admit this: I was disturbingly far into my 30s before I started to realize what it is that Forrester and its endless brethren did. I’m almost certain you can tell that story better than I can, so what is it that Forrester does? What is its place in the ecosystem?
Rachel: Forrester is one of the two or three biggest industry analyst firms. So, the people that work there—the analysts there—are basically paid to be, like, big thinkers and strategists and analysts, right? There’s a reason it’s called that. And so the way that we spent all of our time was, you know, talking to interesting large, typically enterprise IT, and I was in the infrastructure and operations group, so I was speaking to infrastructure, ops, precursors to DevOps—DevOps wasn’t really a thing back in ye olden times, but we were speaking to them and learning their best practices and publishing reports about the technology, the people and the process that they dealt with. And so you know, over the course of a year, I would talk to hundreds of different large enterprises, the infrastructure and ops leaders at everyone from, like, American Express to Johnson & Johnson to Monsanto, learn from them, write research and reports, and also do things like inquiries and speaking engagements and that kind of stuff.
So, the idea of industry analysts is that they’re neutral, they’re objective. You can go to them for advice, and they can tell you, you know, these are the shortlist of vendors you should consider and this is what you should look for in a solution.
Corey: I love the idea of what that role is, but it took me a while as a condescending engineer to really wrap my head around it because I viewed it as oh, it’s just for a cover your ass exercise so that when a big company makes a decision, they don’t get yelled at later, and they said, “Well, it seemed like the right thing to do. You can’t blame us.” And that is an overwhelmingly cynical perspective. But the way it was explained to me, it really was put into context—of all things—by way of using the AWS bill as a lens. There’s a whole bunch of tools and scripts and whatnot on GitHub that will tell you different things about your AWS environment, and if I run them in my environment, yeah, they work super well.
I run them in a client environment and the thing explodes because it’s not designed to work at a scale of 10,000 instances in a single availability zone. It’s not designed to do backing off so it doesn’t exhaust rate limits across the board. It requires a rethinking at that scale. When you’re talking about enterprise-scale, a lot of the Twitter zeitgeist, as it were, about what tools work well and what tools don’t for various startups, they fail to cross over into the bowels of a regulated entity that has a bunch of other governance and management concerns that don’t really apply. So, there’s this idea of okay, now that we’re a large, going entity with serious revenue behind this, and migrating to any of these things is a substantial lift. What is the right answer? And that is sort of how I see the role of these companies in the ecosystem playing out. Is that directionally correct?
Rachel: I would definitely agree that that is directionally correct. And it was the direction that it was going when I was there at Forrester. And by the way, I’ve been gone from there for, I think, eight-plus years. So, you know, it’s definitely evolved, this space—
Corey: A lifetime in tech.
Rachel: Literally feels like a lifetime. Towards the end of my time there was when we were starting to get briefings from this bookstore company—you might have heard of them—um, Amazon?
Corey: Barnes and Noble.
Rachel: Yes. And Barnes and Noble. Yes. So, we’re starting to get briefings from Amazon, you know, about Amazon Web Services, and S3 had just been introduced. And I got really excited about Netflix and chaos engineering—this was 2012, right?—and so I did a bunch of research on chaos engineering and tried to figure out how it could apply to the enterprises.
And I would, like, bring it to Capital One, and they were like, “Ya crazy.” Turns out I think I was just a little bit ahead of my time, and I’m seeing a lot more of the industry analysts now today looking at like, “Okay, well, yeah, what is Uber doing? Like, what is Netflix doing?” And figure out how that can translate to the enterprise. And it’s not a one-to-one, right, just because the people and the structures and the process are so different, so the technology can’t just, like, make the leap on its own. But yes, I would definitely agree with that, but it hasn’t necessarily always been that way.
Corey: Oh, yeah. Like, these days, we’re seeing serverless adoption on some levels being driven by enterprises. I mean, Liberty Mutual is doing stuff there that is really at the avant-garde that startups are learning from. It’s really neat to see that being turned on its head because you always see these big enterprises saying, “We’re like a startup,” but you never see a startup saying, “We’re like a big enterprise.” Because that’s evocative of something that isn’t generally compelling.
“Well, what does that mean, exactly? You take forever to do expense reports, and then you get super finicky about it, and you have so much bureaucracy?” No, no, no, it’s, “Now, that we’re process bound, it’s that we understand data sovereignty and things like that.” But you didn’t stay there forever. You at some point decided, okay, talking to people who are working in this industry is all well and good, but time for you to go work in that industry yourself. And you went to, I believe, NetApp by way of Riverbed.
Rachel: Yes, yeah. So, I left Forrester and I went over to Riverbed to work on their cloud storage solution as a product marketing. And I had an amazing six months at Riverbed, but I happened to join, unfortunately, right around the time they were being taken private, and they ended up divesting their storage product line off to NetApp. And they divested some of their other product lines to some other companies as part of the whole deal going private. So, it was a short stint at Riverbed, although I’ve met some people that I’ve stayed in touch with and are still my friends, you know, many years later.
And so, yeah, ended up over at NetApp. And it wasn’t necessarily what I had initially planned for, but it was a really fun opportunity to take a cloud-integrated storage product—so it was an appliance that people put in their data centers; you could send backups to it, and it shipped those backups on the back end to S3 and then to Glacier when that came out—trying to make that successful in a company that was really not overly associated with cloud. That was a really fun process and a fun journey. And now I look at NetApp and where they are today, and they’ve acquired Spot and they’ve acquired CloudCheckr, and they’re, like, really going all-in in public cloud. And I like to think, like, “Hey, I was in the early days of that.” But yeah, so that was an interesting time in my life for multiple reasons.
Corey: Yeah, Spot was a fascinating product, and I was surprised to see it go to NetApp. It was one of those acquisitions that didn’t make a whole lot of sense to me at the time. NetApp has always been one of those companies I hold in relatively high regard. Back when I was coming up in the industry, a bit before the 2012s or so, it was routinely ranked as the number one tech employer on a whole bunch of surveys. And I don’t think these were the kinds of surveys you can just buy your way to the top of.
People who worked there seemed genuinely happy, the technology was fantastic, and it was, for example, the one use case in which I would run a database where its data store lived on a network file system. I kept whining at the EFS people over at AWS for years that well, EFS is great and all but it’s no NetApp. Then they released NetApp ONTAP on FSx as a first-party service, in which case, okay, thank you. You have now solved every last reservation I have around this. Onward.
And I still hold the system in high regard. But it has, on some level, seen an erosion. We’re no longer in a world where I am hurling big money—or medium money by enterprise standards—off to NetApp for their filers. It instead is something that the cloud providers are providing, and last time I checked, no matter how much I spend on AWS they wouldn’t let me shove a NetApp filer into us-east-1 without asking some very uncomfortable questions.
Rachel: Yeah. The whole storage industry is changing really quickly, and more of the traditional on-premises storage vendors have needed to adapt or… not, you know, be very successful. I think that NetApp’s done a nice job of adapting in recent years. But I’d been in storage and backup for my entire career at that point, and I was like, I need to get out. I’m done with storage. I’m done with backup. I’m done with disaster recovery. I had that time; I want to go try something totally new.
And that was how I ended up leaving NetApp and joining CloudHealth. Because I’d never really done the startup thing. I’d done a medium-sized company at Riverbed; I’d done a pretty big company at NetApp. I’ve always been an entrepreneur at heart. I started my first business on the playground in second grade, and it was reselling sticks of gum. Like, I would go use my allowance to buy a big pack of gum, and then I sold the sticks individually for ten cents apiece, making a killer margin. And it was a subscription, actually. [laugh].
Corey: Administrations generally—at least public schools—generally tend to turn a—have a dim view of those things, as I recall from my misspent youth.
Rachel: Yeah. I was shut down pretty quickly, but it was a brilliant business model. It was—so you had to join the club to even be able to buy into getting the sticks of gum. I was, you know, all over the subscription business [laugh] back then.
Corey: An area I want to explore here is you mentioned that you double-majored. One of those majors was computer science—art history was sort of set aside for the moment, it doesn’t really align with either direction here—then you served as a research associate turned analyst, and then you went into product marketing, which is an interesting direction to go in. Why’d you do it?
Rachel: You know, with product marketing and industry analysts, there’s a lot of synergy; there’s a lot of things that are in common between those two. And in fact, when you see people moving back and forth from the analyst world to the vendor side, a lot of the time it is to product marketing or product management. I mean, product marketing, our whole job is to take really complex technical concepts and relate them back to business concepts and make them make sense to the broader world and tell a narrative around it. That’s a lot of what an analyst is doing too. So, you know, analysts are writing, they’re giving public talks, they’re coming up with big ideas; that’s what a great product marketer is doing also.
So, for me, that shift was actually very natural. And by the way, like, when I graduated from school, I knew I was never going to code for a living. I had learned all I was going to learn and I knew it wasn’t for me. Huge props to, like, you know, all the people that do code for a living; I knew I couldn’t do it. I wasn’t cut out for it.
Corey: I found somewhat similar discoveries on my own journey. I can configure things for a living, it’s fun, but I still need to work with people, past a certain point. I know I’ve talked about this before on some of these shows, but for me, when starting out independently, I sort of assumed at some level, I was going to shut it down, and well, and then I’ll go back to being an SRE or managing an ops team. And it was only somewhat recently that I had the revelation that if everything that I’m building here collapses out from under me or gets acquired or whatnot and I have to go get a real job again, I’ll almost certainly be doing something in the marketing space as opposed to the engineering space. And that was an interesting adjustment to my self-image as I went through it.
Because I’ve built everything that I’ve been doing up until this point, aligned at… a certain level of technical delivery and building things as an engineer, admittedly a mediocre one. And it took me a fair bit of time to get, I guess, over the idea of myself in that context of, “Wow, you’re not really an engineer. Are you a tech worker?” Kind of. And I sort of find myself existing in the in-between spaces.
Did you have similar reticence when you went down the marketing path or was it something that you had, I guess, a more mature view of it [laugh] than I did and said, “Yeah, I see the value immediately,” whereas I had to basically be dragged there kicking and screaming?
Rachel: Well, first of all, Corey, congratulations for coming to terms with the fact that you are a marketer. I saw it in you from the minute I met you, and I think I’ve known you since before you were famous. That’s my claim to fame is that I knew you before you were famous. But for me personally, no, I didn’t actually have that stigma. But that does exist in this industry.
I mean, I think people are—think they look down on marketing as kind of like ugh, you know, “The product sells itself. The product markets itself. We don’t need that.” But when you’re on the inside, you know you can have an amazing product and if you don’t position it well and if you don’t message it well, it’s never going to succeed.
Corey: Our consulting [sub-projects 00:14:31] are basically if you bring us in, you will turn a profit on the engagement. We are selling what basically [unintelligible 00:14:37] money. It is one of the easiest ROI calculations. And it still requires a significant amount of work on positioning even on the sales process alone. There’s no such thing as an easy enterprise sale.
And you’re right, in fact, I think the first time we met, I was still running a DevOps team at a company and I was deploying the product that you were doing marketing for. And that was quite the experience. Honestly, it was one of the—please don’t take this the wrong way at all—but you were at CloudHealth at the time and the entire point was that it was effectively positioned in such a way of, right, this winds up solving a lot of the problems that we have in the AWS bill. And looking at how some of those things were working, it was this is an annoying, obnoxious problem that I wish I could pay to make someone else’s problem, just to make it go away. Well, that indirectly led to exactly where we are now.
And it’s really been an interesting ride, just seeing how that whole thing has evolved. How did you wind up finding yourself at CloudHealth? Because after VMware, you said it was time to go to a startup. And it’s interesting because I look at where you’ve been now, and CloudHealth itself gets dwarfed by VMware, which is sort of the exact opposite of a startup, due to the acquisition. But CloudHealth was independent for years while you were there.
Rachel: Yeah, it was. I was at CloudHealth for about three-plus years before we were acquired. You know, how did I end up there? It’s… it’s all hazy. I was looking at a lot of startups, I was looking for, like, you know, a Series B company, about 50 people, I wanted something in the public cloud space, but not storage—if I could get away from storage that was the dream—and I met the folks from CloudHealth, and obviously, I hadn’t heard about—I didn’t know about cloud cost management or cloud governance or FinOps, like, none of those were things back then, but I was just really attracted to the vision of the founders.
The founders were, you know, Joe Kinsella and Dan Phillips and Dave Eicher, and I was like, “Hey, they’ve built startups before. They’ve got a great idea.” Joe had felt this pain when he was a customer of AWS in the early days, and so I was like—
Corey: As have we all.
Rachel: Right?
Corey: I don’t think you’ll find anyone in this space who hasn’t been a customer in that situation and realized just how painful and maddening the whole space is.
Rachel: Exactly, yeah. And he was an early customer back in, I think, 2014, 2015. So yeah, I met the team, I really believed in their vision, and I jumped in. And it was a really amazing journey, and I got to build a pretty big team over time. By the time we were acquired a couple of years later, I think we were maybe 300 or 400 people. And actually, fun story. We were acquired the same week my son was born, so that was an exciting experience. A lot of change happened in my life all at once.
But during the time there, I got to, you know, work with some really, really cool large cloud-scale organizations. And that was during that time that I started to learn more about Kubernetes and Mesos at the time, and started on the journey that led me to where I am now. But that was one of the happiest accidents, similar to the happy accident of, like, how did I end up at Forrester? Well, I didn’t get the job at Google. [laugh]. How did I end up at CloudHealth? I got connected with the founders and their story was really inspiring.
Corey: It’s amusing to me, the idea that, oh, you’re at NetApp and you want to go do something that is absolutely not storage. Great. So, you go work at CloudHealth. You’re like, “All right. Things are great.” Now, to take a big sip of scalding hot coffee and see just how big AWS billing data could possibly be. Yeah, oops, you’re a storage company all over again.
Some of our, honestly, our largest bills these days are RDS, Athena, and of course, S3 for all of the bill storage we wind up doing for our customers. And it is… it is not small. And that has become sort of an eye-opener for me, just the fact that this is, on some level, a big data problem.
Rachel: Yeah.
Corey: And how do you wind up even understanding all the data that lives in just the outputs of the billing system? Which I feel is sort of a good setup for the next question of after the acquisition, you stayed at VMware for a while and then matriculated out to where you are now where you’re the Head of Product and Technical Marketing at Chronosphere, which is in the observability space. How did you get there from cloud bills?
Rachel: Yeah. So, it all makes sense when I piece it together in my mind. So, when I was at CloudHealth, one of the big, big pain points I was seeing from a lot of our customers was the growth in their monitoring bills. Like, they would be like, “Okay, thanks. You helped us, you know, with our EC2 reservations, and we did right-sizing, and you helped with this. But, like, can you help with our Datadog bill? Like, can you help with our New Relic bill?”
And that was becoming the next biggest line item for them. And in some cases, they were spending more on monitoring and APM and, like, what we now sometimes call observability, they were spending more on that than they were on their public cloud, which is just bananas. So, I would see them making really kind of bizarre choices, and sometimes they’d have to make choices that were really not the best choices. Like, “I guess we’re not going to monitor the lab anymore. We’re just going to uninstall the agents because we can’t pay this anymore.”
Corey: Going down from full observability into sampling. I remember that. The New Relic shuffle is what I believe we called it at the time. Let’s be clear, they have since fixed a lot of their pricing challenges, but it was the idea of, great, suddenly we’re doing a lot more staging environments, and they come knocking asking for more money, but it’s a—I don’t need that level of visibility in the pre-prod environments, I guess. I hate doing it that way because then you have a divergence between pre-prod and actual prod. But it was economically just a challenge. Yeah, because again, when it comes to cloud, architecture and cost are really one and the same.
Rachel: Exactly. And it’s not so much that, like—sure, you know, you can fix the pricing model, but there’s still the underlying issue of it’s not black and white, right? My pre-prod data is not the same value as my prod data, so I shouldn’t have to treat it the same way, shouldn’t have to pay for it the same way. So, seeing that trend on the one hand, and then, on the other hand, 2017, 2018, I started working on the container cost allocation products at CloudHealth, and we were—you know, this was even before that, maybe 2017, we were arguing about, like, Mesos and Kubernetes and which one was going to be, and I got kind of—got very interested in that world.
And so once again, as I was getting to the point where I was ready to leave CloudHealth, I was like, okay, there’s two key things I’m seeing in the market. One is people need a change in their monitoring and observability; what they’re doing now isn’t working. And two, cloud-native is coming up, coming fast, and it’s going to really disrupt this market. So, I went looking for someone that was at the intersection of the two. And that’s when I met the team at Chronosphere, and just immediately hit it off with the founders in a similar way to where I hit it off with the founders at CloudHealth. At Chronosphere, the founders had felt pain—
Corey: Team is so important in these things.
Rachel: It’s really the only thing to me. Like, you spend so much time at work. You need to love who you work with. You need to love your—not love them, but, you know, you need to work with people that you enjoy working with and people that you learn from.
Corey: You don’t have to love all your coworkers, and at best you can get away with just being civil with them, but it’s so much nicer when you can have a productive, working relationship. And that is very far from we’re going to go hang out, have beers after work because that leads to a monoculture. But the ability to really enjoy the people that you work with is so important and I wish that more folks paid attention to that.
Rachel: Yeah, that’s so important to me. And so I met the team, the team was fantastic, just incredibly smart and dedicated people. And then the technology, it makes sense. We like to joke that we’re not just taking the box—the observability box—and writing Kubernetes in Crayon on the outside. It was built from the ground up for cloud-native, right?
So, it’s built for this speed, containers coming and going all the time, for the scale, just how much more metrics and observability data that containers emit, the interdependencies between all of your microservices and your containers, like, all of that stuff. When you combine it makes the older… let’s call them legacy. It’s crazy to call, like, some of these SaaS solutions legacy but they really are; they weren’t built for cloud-native, they were built for VMs and a more traditional cloud infrastructure, and they’re starting to fall over. So, that’s how I got involved. It’s actually, as we record, it’s my one-year anniversary at Chronosphere. Which is, it’s been a really wild year. We’ve grown a lot.
Corey: Congratulations. I usually celebrate those by having a surprise meeting with my boss and someone I’ve never met before from HR. They don’t offer you coffee. They have the manila envelope of doom in front of them and hold on, it’s going to be a wild meeting. But on the plus side, you get to leave work early today.
Rachel: So, good thing you run your own business now, Corey.
Corey: Yeah, it’s way harder for me to wind up getting surprise-fired. I see it coming [laugh]—
Rachel: [laugh].
Corey: —a ways away now, and it looks like an economic industry trend.
Rachel: [sigh]. Oh, man. Well, anyhow.
Corey: Selfishly, I have to ask. You spent a lot of time working in cloud cost, to a point where I learned an awful lot from you as I was exploring the space and learning as I went. And, on some level, for me at least, it’s become an aspect of my identity, for better or worse. What was it like for you to leave and go into an orthogonal space? And sure, there’s significant overlap, but it’s a very different problem aimed at different buyers, and honestly, I think it is a more exciting problem that you are in now, from a business strategic perspective because there’s a limited amount of what you can cut off that goes up theoretically to a hundred percent of the cloud bill. But getting better observability means you can accelerate your feature velocity and that turns into something rather significant rather quickly. But what was it like?
Rachel: It’s uncomfortable, for sure. And I tend to do this to myself. I get a little bit itchy the same way I wanted to get out of storage. It’s not because there’s anything wrong with storage; I just wanted to go try something different. I tend to, I guess, do this to myself every five years or so; I make a slightly orthogonal switch in the space that I’m in.
And I think it’s because I love learning something new. The jumping into something new and having the fresh eyes is so terrifying, but it’s also really fun. And so it was really hard to leave cloud cost management. I mean, I got to Chronosphere and I was like, “Show me the cloud bill.” And I was like, “Do we have Reserved Instances?” Like, “Are we doing Committed Use Discounts with Google?”
I just needed to know. And then that helped. Okay, I got a look at the cloud bill. I felt a little better. I made a few optimizations and then I got back to my actual job which was, you know, running product marketing for Chronosphere. And I still love to jump in and just make just a little recommendation here and there. Like, “Oh, I noticed the costs are creeping up on this. Did we consider this?”
Corey: Oh, I still get a kick out of that where I was talking to an Amazonian whose side project was 110 bucks a month, and he’s like, yeah, I don’t think you could do much over here. It’s like, “Mmm, I’ll bet you a drink I can.”—
Rachel: Challenge accepted.
Corey: —it’s like, “All right. You’re on.” Cut it to 40 bucks. And he’s like, “How did you do that?” It’s because I know what I’m doing and this pattern repeats.
And it’s the architectural misconfigurations, bounded by context, that turn into so much. And I still maintain that I can look at the AWS bill for most environments for last month and have a pretty good idea, based upon nothing other than that, what’s going on in the environment. It turns out that maybe that’s a relatively crappy observability system when all is said and done, but it tells an awful lot. I can definitely see the appeal of wanting to get away from purely cost-driven or cost-side information and into things that give a lot more context into how things are behaving, how they’re performing. I think there’s been something of an industry rebrand away from monitoring, alerting, and trending over time to calling it observability.
And I know that people are going to have angry opinions about that—and it’s imperative that you not email me—but it all is getting down to the same thing of is my site up or down? Or in larger distributed systems, how down is it? And I still think we’re learning an awful lot. I cringe at the early days of Nagios when that was what I was depending upon to tell me whether my site was up or not. And oh, yeah, turns out that when the Nagios server goes down, you have some other problems you need to think about. It became this iterative, piling up on and piling up on and piling up on until you can get sort of good at it.
But the entire ecosystem around understanding what’s going on in your application has just exploded since the last time I was really running production sites of any scale, in anger. So, it really would be a different world today.
Rachel: It’s changing so fast and that’s part of what makes it really exciting. And the other big thing that I love about this is, like, this is a must-have. This is table stakes. This is not optional. Like, a great observability solution is the difference between conquering a market or being overrun.
If you look at what our founders—our founders at Chronosphere came from Uber, right? They ran the observability team at Uber. And they truly believe—and I believe them, too—that this was a competitive advantage for them. The fact that you could go to Uber and it’s always up and it’s always running and you know you’re not going to have an issue, that became an advantage to them that helped them conquer new markets. We do the same thing for our customers.
Corey: The entire idea around how these things are talked about in terms of downtime and the rest is just sort of ludicrous, on some level, because we take specific cases as industry truths. Like, I still remember, when Amazon was down one day when I was trying to buy a pair of underwear. And by that theory, it was—great, I hit a 404 page and a picture of a dog. Well, according to a lot of these industry truisms, then, well, one day a week for that entire rotation of underpants, I should have just been not wearing any. But no here in reality, I went back an hour later and bought underpants.
Now, counterpoint: If every third time I wound up trying to check out at Amazon, I wound up hitting that error page, I would spend a lot more money at Target. There is a point at which repeated downtime comes at a cost. But one-offs for some businesses are just fine. Counterpoint: if Uber is down when you’re trying to get a ride, well, that ride [unintelligible 00:28:36] may very well be lost for them and there is a definitive cost. No one’s going to go back and click on an ad as well, for example, and Amazon is increasingly an advertising company.
So, there’s a lot of nuance to it. I think we can generally say that across the board, in most cases, downtime bad. But as far as how much that is and what form that looks like and what impact that has on your company, it really becomes situationally dependent.
Rachel: I’m just going to gloss over the fact that you buy your underwear on Amazon and really not make any commentary on that. But I mean—
Corey: They sell everything there. And the problem, of course, is the crappy counterfeit underwear under the Amazon Basics brand that they ripped off from the good underwear brands. But that’s a whole ‘nother kettle of wax for a different podcast.
Rachel: Yep. Once again, not making any commentary on your—on that. Sorry, I lost my train of thought. I work in my dining room. My husband, my dog are all just—welcome to pandemic life here.
Corey: No, it’s fair. They live there. We don’t, as a general rule.
Rachel: [laugh]. Very true. Yeah. You’re not usually in my dining room, all of you but—oh, so uptime downtime, also not such a simple conversation, right? It’s not like all of Amazon is down or all of DoorDash is down. It might just be one individual service or one individual region or something that is—
Corey: One service in one subset of one availability zone. And this is the problem. People complain about the Amazon status page, but if every time something was down, it reflected there, you’d see a never-ending sea of red, and that would absolutely erode confidence in the platform. Counterpoint: when things are down for you and it’s not red, it’s maddening. And there’s no good answer.
Rachel: No. There’s no good answer. There’s no good answer. And the [laugh] yeah, the Amazon status page. And this is something I—bringing me back to my Forrester days, availability and resiliency in the cloud was one of the areas I focused on.
And, you know, this was once again, early days of public cloud, but remember when Netflix went down on Christmas Eve, and—God, what year was this? Maybe… 2012, and that was the worst possible time they could have had downtime because so many people are with their families watching their Doctor Who Christmas Specials, which is what I was trying to watch at the time.
Corey: Yeah, now you can’t watch it. You have to actually talk to those people, and none of us can stand them. And oh, dear Lord, yeah—
Rachel: What a nightmare.
Corey: —brutal for the family dynamic. Observability is one of those things as well that unlike you know, the AWS bill, it’s very easy to explain to people who are not deep in the space where it’s, “Oh, great. Okay. So, you have a website. It goes well. Then you want—it gets slow, so you put it on two computers. Great. Now, it puts on five computers. Now, it’s on 100 computers, half on the East Coast, half on the West Coast. Two of those computers are down. How do you tell?”
And it turns in—like, they start to understand the idea of understanding what’s going on in a complex system. “All right, how many people work at your company?” “2000,” “Great. Three laptops are broken. How do you figure out which ones are broken?” If you’re one of the people with a broken laptop, how do you figure out whether it’s your laptop or the entire system? And it lends itself really well to analogies, whereas if I’m not careful when I describe what I do, people think I can get them a better deal on underpants. No, not that kind of Amazon bill. I’m sorry.
Rachel: [laugh]. Yeah, or they started to think that you’re some kind of accountant or a tax advisor, but.
Corey: Which I prefer, as opposed to people at neighborhood block parties thinking that I’m the computer guy because then it’s, “Oh, I’m having trouble with the printer.” It’s, “Great. Have you tried [laugh] throwing away and buying a new one? That’s what I do.”
Rachel: This is a huge problem I have in my life of everyone thinking I’m going to fix all of their computer and cloud things. And I come from a big tech family. My whole family is in tech, yet somehow I’m the one at family gatherings doing, “Did you turn it off and turn it back on again?” Like, somehow that’s become my job.
Corey: People get really annoyed when you say that and even more annoyed when it fixes the problem.
Rachel: Usually does. So, the thread I wanted to pick back up on though before I got distracted by my husband and dog wandering around—at least my son is not in the room with us because he’d have a lot to say—is that the standard industry definition of observability—so once again, people are going to write to us, I’m sure; they can write to me, not you, Corey, about observability, it’s just the latest buzzword. It’s just monitoring, or you know—
Corey: It’s hipster monitoring.
Rachel: Hipster monitoring. That’s what you like to call it. I don’t really care what we call it. The important thing is it gets us through three phases, right? The first is knowing that something is wrong. If you don’t know what’s wrong, how are you supposed to ever go fix it, right? So, you need to know that those three laptops are broken.
The next thing is you need to know how bad it is. Like, if those three broken laptops belong to the CEO, the COO, and the CRO, that’s real bad. If it’s three, you know, random peons in marketing, maybe not so bad. So, you need to triage, you need to understand roughly, like, the order of magnitude of it, and then you need to fix it. [laugh].
Once you fix it, you can go back and then say, all right, what was the root cause of this? How do we make sure this doesn’t happen again? So, the way you go through that cycle, you’re going to use metrics, you might use logs, you might use traces, but that’s not the definition of observability. Observability is all about getting through that, know, then triage, then fix it, then understand.
Corey: I really want to thank you for taking the time to speak with me today. If people do want to learn more, give you their unfiltered opinions, where’s the best place to find you?
Rachel: Well, you can find me on Twitter, I’m @RachelDines. You can also email me, rachel@chronosphere.io. I hope I don’t regret giving out that email address. That’s a good way you can come and argue with me about what is observability. I will not be giving advice on cloud bills. For that, you should go to Corey. But yeah, that’s a good way to get in touch.
Corey: Thank you so much for your time. I really appreciate it.
Rachel: Yeah, thank you.
Corey: Rachel Dines, Head of Product and Technical Marketing at Chronosphere. I’m Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you’ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice whereas if you’ve hated this podcast, please leave a five-star review on your podcast platform of choice, and castigate me with an angry comment telling me that I really should have followed the thread between the obvious link between art history and AWS billing, which is almost certainly a more disturbing Caravaggio.
Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Announcer: This has been a HumblePod production. Stay humble.
2021 Duckbill Group, LLC