The re:Invent Wheel in the Sky Keeps on Turning with Pete Cheslock

Pete Cheslock, Cloud Economist and Duckbill Group alum, sits down with Corey for their annual tradition of re:Invent re:Cap. After 2020’s generally crappy virtual event, and this year’s arguably awkward “hybrid” event, Pete and Corey have many coals to rake over. Pete and Corey talk about the general weirdness of the 2021 re:Invent amidst a still-ongoing pandemic. One topic of conversation is the relentless insistence of certain companies to do what the customer asks, without even asking if they should. Pete expounds on some of the offerings to come out of re:Invent, and he and Corey try to sift through the chaff for some of the wheat.

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored in part by LaunchDarkly. Take a look at what it takes to get your code into production. I’m going to just guess that it’s awful because it’s always awful. No one loves their deployment process. What if launching new features didn’t require you to do a full-on code and possibly infrastructure deploy? What if you could test on a small subset of users and then roll it back immediately if results aren’t what you expect? LaunchDarkly does exactly this. To learn more, visit launchdarkly.com and tell them Corey sent you, and watch for the wince.

Corey: This episode is sponsored in part by our friends at Redis, the company behind the incredibly popular open source database that is not the BIND DNS server. If you’re tired of managing open source Redis on your own, or you’re using one of the vanilla cloud caching services, these folks have you covered with the go-to managed Redis service for global caching and primary database capabilities: Redis Enterprise. To learn more and deploy not only a cache but a single operational data platform for one Redis experience, visit redis.com/hero. That’s r-e-d-i-s.com/hero. And my thanks to my friends at Redis for sponsoring my ridiculous nonsense.

Corey: Welcome to Screaming in the Cloud. I’m Corey Quinn. I am joined—as is tradition, for a post-re:Invent wrap-up, a month or so later, once everything has had time to settle—by my friend and yours, Pete Cheslock. Pete, how are you?

Pete: Hi, I’m doing fantastic. New year; new me. That’s what I’m going with.

Corey: That’s the problem. I keep hoping for that, but every time I turn around, it’s still me. And you know, honestly, I wouldn’t wish that on anyone.

Pete: Exactly. [laugh]. I wouldn’t wish you on me either. But somehow I keep coming back for this.

Corey: So, in two-thousand twenty—or twenty-twenty, as the children say—re:Invent was fully virtual. And that felt weird. Then re:Invent 2021 was a hybrid event which, let’s be serious here, is not really those things. They had a crappy online thing and then a differently crappy thing in person. But it didn’t feel real to me because you weren’t there.

That is part of the re:Invent tradition. There’s a midnight madness thing, there’s a keynote where they announce a bunch of nonsense, and then Pete and I go and have brunch on the last day of re:Invent and decompress, and more or less talk smack about everything that crosses our minds. And you weren’t there this year. I had to backfill you with Tim Banks. You know, the person that I backfilled you with here at The Duckbill Group as a principal cloud economist.

Pete: You know, you got a great upgrade in hot takes, I feel like, with Tim.

Corey: And other ways, too, but it’s rude of me to say that to you directly. So yeah, his hot takes are spectacular. He was going to be doing this with me, except you cannot mess with tradition. You really can’t.

Pete: Yeah. I’m trying to think how many—is this the third year? It’s at least three.

Corey: Third or fourth.

Pete: Yeah, it’s at least three. Yeah, it was, I don’t want to say I was sad to not be there because, with everything going on, it’s still weird out there. But I am always—I’m just that weird person who actually likes re:Invent, but not, I feel like, for the reasons people think. Again, I’m such an extroverted-type person, that it’s so great to have this, like, serendipity to re:Invent. The people that you run into and the conversations that you have, and prior—like in 2019, I think was a great example because that was the last one I had gone to—you know, having so many conversations so quickly because everyone is there, right? It’s like this magnet that attracts technologists, and venture capital, and product builders, and all this other stuff. And it’s all compressed into, like, you know, that five-day span, I think is the biggest part that makes it so great.

Corey: The fear in people’s eyes when they see me. And it was fun; I had a pair of masks with me. One of them was a standard mask, and no one recognizes anyone because, masks, and the other was a printout of my ridiculous face, which was horrifyingly uncanny, but also made it very easy for people to identify me. And depending upon how social I was feeling, I would wear one or the other, and it worked flawlessly. That was worth doing. They really managed to thread the needle, as well, before Omicron hit, but after the horrors of last year. So, [unintelligible 00:03:00]—

Pete: It really—

Corey: —if it were going on right now, it would not be going on right now.

Pete: Yeah. I talk about really—yeah—really just hitting it timing-wise. Like, not that they could have planned for any of this, but like, as things were kind of not too crazy and before they got all crazy again, it feels like wow, like, you know, they really couldn’t have done the event at any other time. And it’s like, purely due to luck. I mean, absolutely one hundred percent.

Corey: That’s the amazing power of frugality. Because the reason it’s then is it’s the week after Thanksgiving every year when everything is dirt cheap. And, you know, if there’s one thing that a one-point-seve—sorry, their stock’s in the toilet—a $1.6 trillion company is very concerned about, it is saving money at every opportunity.

Pete: Well, the one thing that I was most curious about—so I was at the first re:Invent in—what—2012 I think it was, and there was—it was quaint, right?—there was 4000 people there, I want to say. It was in the thousands of people. Now granted, still a big conference, but it was in the Sands Convention Center. It was in that giant room, the same number of people, where, you know, people’s booths were like tables, like, eight-by-ten tables, right? [laugh].

It had almost a DevOpsDays feel to it. And I was kind of curious if this one had any of those feelings. Like, did it evoke it being more quaint and personable, or was it just as soulless as it probably has been in recent years?

Corey: This was fairly soulless because they reduced the footprint of the event. They dropped from two expo halls down to one, they cut the number of venues, but they still had what felt like 20,000 people or something there. It was still crowded, it was still packed. And I’ve done some diligent follow-ups afterwards, and there have been very few cases of Covid that came out of it. I quarantined for a week in a hotel, so I don’t come back and kill my young kids for the wrong reasons.

And that went—that was sort of like the worst part of it on some level, where it’s like great. Now I could sit alone at a hotel and do some catch-up and all the rest, but all right I’d kind of like to go home. I’m not used to being on the road that much.

Pete: Yeah, I think we’re all a little bit out of practice. You know, I haven’t been on a plane in years. I mean, the travel I’ve done more recently has been in my car from point A to point B. Like, direct, you know, thing. Actually, a good friend of mine who’s not in technology at all had to travel for business, and, you know, he also has young kids who are under five, so when he got back, he actually hid in a room in their house and quarantined himself in the room. But they—I thought, this is kind of funny—they never told the kids he was home. Because they knew that like—

Corey: So, they just thought the house was haunted?

Pete: [laugh].

Corey: Like, “Don’t go in the west wing,” sort of level of nonsense. That is kind of amazing.

Pete: Honestly, like, we were hanging out with the family because they’re our neighbors. And it was like, “Oh, yeah, like, he’s in the guest room right now.” Kids have no idea. [laugh]. I’m like, “Oh, my God.” I’m like, I can’t even imagine. Yeah.

Corey: So, let’s talk a little bit about the releases of re:Invent. And I’m going to lead up with something that may seem uncharitable, but I don’t think it necessarily is. There wasn’t the usual torrent of new releases for ridiculous nonsense in the same way that there has been previously. There was no, this service talks to satellites in space. I mean, sure, there was some IoT stuff to manage fleets of cars, and giant piles of robots, and cool, I don’t have those particular problems; I’m trying to run a website over here.

So okay, great. There were enhancements to a number of different services that were in many cases appreciated, in other cases, irrelevant. Werner said in his keynote, that it was about focusing on primitives this year. And, “Why do we have so many services? It’s because you asked for it… as customers.”

Pete: [laugh]. Yeah, you asked for it.

Corey: What have you been asking for, Pete? Because I know what I’ve been asking for and it wasn’t that. [laugh].

Pete: It’s amazing to see a company continually say yes to everything, and somehow, despite their best efforts, be successful at doing it. No other company could do that. Imagine any other software technology business out there that just builds everything the customers ask for. Like from a product management business standpoint, that is, like, rule 101 is, “Listen to your customers, but don’t say yes to everything.” Like, you can’t do everything.

Corey: Most companies can’t navigate the transition between offering the same software in the Cloud and on a customer facility. So, it’s like, “Ooh, an on-prem version, I don’t know, that almost broke the company the last time we tried it.” Whereas you have Amazon whose product strategy is, “Yes,” being able to put together a whole bunch of things. I also will challenge the assertion that it’s the primitives that customers want. They don’t want to build a data center out of popsicle sticks themselves. They want to get something that solves a problem.

And this has been a long-term realization for me. I used to work at Media Temple as a senior systems engineer running WordPress at extremely large scale. My websites now run on WordPress, and I have the good sense to pay WP Engine to handle it for me, instead of doing it myself because it’s not the most productive use of my time. I want things higher up the stack. I assure you I pay more to WP Engine than it would cost me to run these things myself from an infrastructure point of view, but not in terms of my time.

What I see sometimes as the worst of all worlds is that AWS is trying to charge for that value-added pricing without adding the value that goes along with it because you still got to build a lot of this stuff yourself. It’s still a very janky experience, you’re reduced to googling random blog posts to figure out how this thing is supposed to work, and the best documentation comes from externally. Whereas with a company that’s built around offering solutions like this, great. In the fullness of time, I really suspect that if this doesn’t change, their customers are going to just be those people who build solutions out of these things. And let those companies capture the up-the-stack margin. Which I have no problem with. But they do because Amazon is a company that lies awake at night actively worrying that someone, somewhere, who isn’t them might possibly be making money somehow.

Pete: I think MongoDB is a perfect example of—like, look at their stock price over the last, whatever, years. Like, they, I feel like everyone called for the death of MongoDB every time Amazon came out with their new things, yet, they’re still a multi-billion dollar company because I can just—give me an API endpoint and you scale the database. There’s—

Corey: Look at all the high-profile hires that Mongo was making out of AWS, and I can’t shake the feeling they’re sitting there going, “Yeah, who’s losing important things out of production now?” It’s, everyone is exodus-ing there. I did one of those ridiculous graphics naming all the people that went over there, and in—with the hurricane evacuation traffic picture, and there’s one car going the other way that I just labeled with, “Re:Invent sponsorship check,” because yeah, they have a top-tier sponsorship and it was great. I’ve got to say I’ve been pretty down on MongoDB for a while, for a variety of excellent reasons based upon, more or less, how they treated customers who were in pain. And I’d mostly written it off.

I don’t do that anymore. Not because I inherently believe the technology has changed, though I’m told it has, but by the number of people who I deeply respect who are going over there and telling me, no, no, this is good. Congratulations. I have often said you cannot buy authenticity, and I don’t think that they are, but the people who are working there, I do not believe that these people are, “Yeah, well, you bought my opinion.” You can buy their attention, not their opinion. If someone changes their opinion, based upon where they work, I kind of question everything they’re telling me is, like, “Oh, you’re just here to sell something you don’t believe in? Welcome aboard.”

Pete: Right. Yeah, there’s an interview question I like to ask, which is, “What’s something that you used to believe in very strongly that you’ve more recently changed your mind on?” And out of politeness because it usually throws people back a little bit, and they’re like, “Oh, wow. Like, let me think about that.” And I’m like, “Okay, while you think about that I want to give you mine.”

Which is in the past, my strongly held belief was we had to run everything ourselves. “You own your availability,” was the line. “No, I’m not buying Datadog. I can build my own metric stack just fine, thank you very much.” Like, “No, I’m not going to use these outsourced load balancers or databases because I need to own my availability.”

And what I realized is that all of those decisions lead to actually delivering and focusing on things that were not the core product. And so now, like, I’ve really flipped 180, that, if any—anything that you’re building that does not directly relate to the core product, i.e., how your business makes money, should one hundred percent be outsourced to an expert that is better than you. Mongo knows how to run Mongo better than you.

Corey: “What does your company do?” “Oh, we handle expense reports.” “Oh, what are you working on this month?” “I’m building a load balancer.” It’s like that doesn’t add the value. Don’t do that.

Pete: Right. Exactly. And so it’s so interesting, I think, to hear Werner say that, you know, we’re just building primitives, and you asked for this. And I think that concept maybe would work years ago, when you had a lot of builders who needed tools, but I don’t think we have any, like, we don’t have as many builders as before. Like, I think we have people who need more complete solutions. And that’s probably why all these businesses are being super successful against Amazon.

Corey: I’m wondering if it comes down to a cloud economic story, specifically that my cloud bill is always going to be variable and it’s difficult to predict, whereas if I just use EC2 instances, and I build load balancers or whatnot, myself, well, yeah, it’s a lot more work, but I can predict accurately what my staff compensation costs are more effectively than I can predict what a CapEx charge would be or what the AWS bill is going to be. I’m wondering if that might in some way shape it?

Pete: Well, I feel like as people get better at managing their costs, right, you’ll eventually move to a world where, like, “Yep, okay, first, we turned off waste,” right? Like, step one is waste. Step two is, like, understanding your spend better to optimize but, like, step three, like, the galaxy brain meme of Amazon cost stuff is all, like, unit economics stuff, where you’re trying to better understand the actual cost to deliver an actual feature. And yeah, I think that actually gets really hard when you give—kind of spread your product across, like, a slew of services that have varying levels of costs, varying levels of tagging, so you can attribute it. Like, it’s really hard. Honestly, it’s pretty easy if I have 1000 EC2 servers with very specific tags, I can very easily figure out what it costs to deliver product. But if I have—

Corey: Yeah, if I have Corey build it, I know what Corey is going to cost, and I know how many servers he’s going to use. Great, if I have Pete build it, Pete’s good at things, it’ll cut that server bill in half because he actually knows how to wind up being efficient with things. Okay, great. You can start calculating things out that way. I don’t think that’s an intentional choice that companies are making, but I feel like that might be a natural outgrowth of it.

Pete: Yeah. And there’s still I think a lot of the, like, old school mentality of, like, the, “Not invented here,” the, “We have to own our availability.” You can still own your availability by using these other vendors. And honestly, it’s really heartening to see so many companies realize that and realize that I don’t need to get everything from Amazon. And honestly, like, in some things, like I look at a cloud Amazon bill, and I think to myself, it would be easier if you just did everything from Amazon versus having these ten other vendors, but those ten other vendors are going to be a lot better at running the product that they build, right, that as a service, than you probably will be running it yourself. Or even Amazon’s, like, you know, interpretation of that product.

Corey: A few other things that came out that I thought were interesting, at least the direction they’re going in. The changes to S3 intelligent tiering are great, with instant retrieval on Glacier. I feel like that honestly was—they talk a good story, but I feel like that was competitive response to Google offering the same thing. That smacks of a large company with its use case saying, “You got two choices here.” And they’re like, “Well, okay. Crap. We’re going to build it then.”

Or alternately, they’re looking at the changes that they’re making to intelligent tiering, they’re now shifting that to being the default as far as recommendations go. There are a couple of drawbacks to it, but not many, and it’s getting easier now to not have the mental overhead of trying to figure out exactly what your lifecycle policies are. Yeah, there are some corner cases where, okay, if I adjust this just so, then I could save 10% on that monitoring fee or whatnot. Yeah, but look how much work that’s going to take you to curate and make sure that you’re not doing something silly. That feels like it is such an in-the-margins issue. It’s like, “How much data are you storing?” “Four exabytes.” Okay, yeah. You probably want some people doing exactly that, but that’s not most of us.

Pete: Right. Well, there’s absolutely savings to be had. Like, if I had an exabyte of data on S3—which there are a lot of people who have that level of data—then it would make sense for me to have an engineering team whose sole purpose is purely optimizing our data lifecycle for that data. Until a point, right? Until you’ve optimized the 80%, basically. You optimize the first 80, that’s probably, air-quote, “Easy.” The last 20 is going to be incredibly hard, maybe you never even do that.

But at lower levels of scale, I don’t think the economics actually work out to have a team managing your data lifecycle of S3. But the fact that now AWS can largely do it for you in the background—now, there’s so many things you have to think about and, like, you know, understand even what your data is there because, like, not all data is the same. And since S3 is basically like a big giant database you can query, you got to really think about some of that stuff. But honestly, what I—I don’t know if—I have no idea if this is even being worked on, but what I would love to see—you know, hashtag #AWSwishlist—is, now we have countless tiers of EBS volumes, EBS volumes that can be dynamically modified without touching, you know, the physical host. Meaning with an API call, you can change from the gp2 to gp3, or io whatever, right?

Corey: Or back again if it doesn’t pan out.

Pete: Or back again, right? And so for companies with large amounts of spend, you know, economics makes sense that you should have a team that is analyzing your volumes usage and modifying that daily, right? Like, you could modify that daily, and I don’t know if there’s anyone out there that’s actually doing it at that level. And they probably should. Like, if you got millions of dollars in EBS, like, there’s legit savings that you’re probably leaving on the table without doing that. But that’s what I’m waiting for Amazon to do for me, right? I want intelligent tiering for EBS because if you’re telling me I can API call and you’ll move my data and make that better, make that [crosstalk 00:17:46] better [crosstalk 00:17:47]—

Corey: Yeah it could be like their auto-scaling for DynamoDB, for example. Gives you the capacity you need 20 minutes after you needed it. But fine, whatever because if I can schedule stuff like that, great, I know what time of day, the runs are going to kick off that beat up the disks. I know when end-of-month reporting fires off. I know what my usage pattern is going to be, by and large.

Yeah, part of the problem too, is that I look at this stuff, and I get excited about it with the intelligent tiering… at The Duckbill Group we’ve got a few hundred S3 buckets lurking around. I’m thinking, “All right, I’ve got to go through and do some changes on this and implement all of that.” Our S3 bill’s something like 50 bucks a month or something ridiculous like that. It’s a no, that really isn’t a thing. Like, I have a screenshot bucket that I have an app installed—I think called Dropshare—that hooks up to anytime I drag—I hit a shortcut, I drag with the mouse to select whatever I want and boom, it’s up there and the URL is now copied to my clipboard, I can paste that wherever I want.

And I’m thinking like, yeah, there’s no cleanup on that. There’s no lifecycle policy that’s turning into anything. I should really go back and age some of it out and do the rest and start doing some lifecycle management. It—I’ve been using this thing for years and I think it’s now a whopping, what, 20 cents a month for that bucket. It’s—I just don’t—

Pete: [laugh].

Corey: —I just don’t care, other than the voice in the back of my mind, “That’s an unbounded growth problem.” Cool. When it hits 20 bucks a month, then I’ll consider it. But until then I just don’t. It does not matter.

Pete: Yeah, I think yeah, scale changes everything. Start adding some zeros and percentages turn into meaningful numbers. And honestly, back on the EBS thing, the one thing that really changed my perspective of EBS, in general, is—especially coming from the early days, right? One terabyte volume, it was a hard drive in a thing. It was a virtual LUN on a SAN somewhere, probably.

Nowadays, and even, like, many years after those original EBS volumes, like all the limits you get in EBS, those are actually artificial limits, right? If you’re like, “My EBS volume is too slow,” it’s not because, like, the hard drive it’s on is too slow. That’s an artificial limit that is likely put in place due to your volume choice. And so, like, once you realize that in your head, then your concept of how you store data on EBS should change dramatically.

Corey: Oh, AWS had a blog post recently talking about, like, with io2 and the limits and everything, and there was architecture thinking, okay. “So, let’s say this is insufficient and the quarter-million IOPS a second that you’re able to get is not there.” And I’m sitting there thinking, “That is just ludicrous data volume and data interactivity model.” And it’s one of those, like, I’m sitting here trying to think about, like, I haven’t had to deal with a problem like that in a decade, just because it’s, “Huh. Turns out getting this one thing that’s super fast is kind of expensive.” If you parallelize it out, that’s usually the right answer, and that’s how the internet has mostly evolved. But there are use cases for which that doesn’t work, and I’m excited to see it. I don’t want to pay for it in my view, but it’s nice to see it.

Pete: Yeah, it’s kind of fun to go into the Amazon calculator and price out one of the, like, io2 volumes and, like, maxed out. It’s like, I don’t know, like $50,000 a month or a hun—like, it’s some just absolutely absurd number. But the beauty of it is that if you needed that volume for an hour to run some intensive data processing task, you can have it for an hour and then just kill it when you’re done, right? Like, that is what is most impressive.

Corey: I copied 130 gigs of data to an EFS volume, which was—[unintelligible 00:21:05] EFS has gone from “This is a piece of junk,” to one of my favorite services. It really is, just because of its utility and different ways of doing things. I didn’t have the foresight to just use a second EFS volume for this. So, I was unzipping a whole bunch of small files onto it. Great.

It took a long time for me to go through it. All right, now that I’m done with that I want to clean all this up. My answer was to ultimately spin up a compute node and wind up running a whole bunch of—like, 400 simultaneous rm -rf on that long thing. And it was just, like, this feels foolish and dumb, but here we are. And I’m looking at the stats on it because the instance was—all right, at that point, the load average [on the instance 00:21:41] was like 200, or something like that, and the EFS volume was like, “Ohh, wow, you’re really churning on this. I’m now at, like, 5% of the limit.” Like, okay, great. It turns out I’m really bad at computers.

Pete: Yeah, well, that’s really the trick is, like, yeah, sure, you can have a quarter-million IOPS per second, but, like, what’s going to break before you even hit that limit? Probably many other things.

Corey: Oh, yeah. Like, feels like on some level if something gets to that point, it’s a misconfiguration somewhere. But honestly, that’s the thing I find weirdest about the world in which we live is that at a small scale—if I have a bill in my $5 a month shitposting account, great. If I screw something up and cost myself a couple hundred bucks in misconfiguration it’s going to stand out. At large scale, it doesn’t matter if—you’re spending $50 million a year or $500 million a year on AWS and someone leaks your creds, and someone spins up a whole bunch of Bitcoin miners somewhere else, you’re not going to see that on your bill until they’re mining basically all the Bitcoin. It just gets lost in the background.

Pete: I’m waiting for those—I’m actually waiting for the next level of them to get smarter because maybe you have, like, an aggressive tagging system and you’re monitoring for untagged instances, but the move here would be, first get the creds and query for, like, the most used tags and start applying those tags to your Bitcoin mining instances. My God, it’ll take—

Corey: Just clone a bunch of tags. Congratulations, you now have a second BI Elasticsearch cluster that you’re running yourself. Good work.

Pete: Yeah. Yeah, people won’t find that until someone comes along after the fact. Like, “Why do we have two of these things?” And you’re like—[laugh].

Corey: “Must be a DR thing.”

Pete: It’s maxed-out CPU. Yeah, exactly.

Corey: [laugh].

Pete: Oh, the terrible ideas—please, please, hackers don’t take our terrible ideas.

Corey: I had a, kind of, whole thing I did on Twitter years ago, talking about how I would wind up using the AWS Marketplace for an embezzlement scheme. Namely, I would just wind up spinning up something that had, like, a five-cent an hour charge or whatnot and just, like, basically rebadge the CentOS Community AMI or whatnot. Great. And then write a blog post, not attached to me, that explains how to do a thing that I’m going to be doing in production in a week or two anyway. Like, “How to build an auto-scaling group,” and reference that AMI.

Then if it ever comes out, like, “Wow, why are we having all these marketplace charges on this?” “I just followed the blog post like it said here.” And it’s like, “Oh, okay. You’re a dumbass. The end.”

That’s the way to do it. A month goes by and suddenly it came out that someone had done something similarly. They wound up rebadging these community things on the marketplace and charging big money for it, and I’m sitting there going like that was a joke. It wasn’t a how-to. But yeah, every time I make these jokes, I worry someone’s going to do it.

Pete: “Welcome to large-scale fraud with Corey Quinn.”

Corey: Oh, yeah, it’s fraud at scale is really the important thing here.

Corey: This episode is sponsored by our friends at Oracle. HeatWave is a new high-performance accelerator for the Oracle MySQL Database Service. Although I insist on calling it “my squirrel.” While MySQL has long been the world’s most popular open source database, shifting from transacting to analytics required way too much overhead and, ya know, work. With HeatWave you can run your OLTP and OLAP, don’t ask me to ever say those acronyms again, workloads directly from your MySQL database and eliminate the time-consuming data movement and integration work, while also performing 1100X faster than Amazon Aurora, and 2.5X faster than Amazon Redshift, at a third of the cost. My thanks again to Oracle Cloud for sponsoring this ridiculous nonsense.

Corey: I still remember a year ago now at re:Invent 2021, was it, or was it 2020? Whatever. They came out with, I want to say it wasn’t gp3, or maybe it was; regardless, there was a new EBS volume type that came out that you were playing with to see how it worked and you experimented with it—

Pete: Oh, yes.

Corey: —and the next morning, you looked at the—I checked Slack and you’re like well, my experiments yesterday cost us $5,000. And at first, like, the—my response is instructive on this because, first, it was, “Oh, my God. What’s going to happen now?” And it’s like, first, hang on a second.

First off, that seems suspect but assume it’s real. I assumed it was real at the outset. It’s “Oh, right. This is not my personal $5-a-month toybox account. We are a company; we can absolutely pay that.” Because it’s like, I could absolutely reach out, call it a favor. “I made a mistake, and I need a favor on the bill, please,” to AWS.

And I would never live it down, let’s be clear. For a $7,000 mistake, I would almost certainly eat it. As opposed to having to prostrate myself like that in front of Amazon. I’m like, no, no, no. I want one of those like—if it’s like, “Okay, you’re going to, like, set back the company roadmap by six months if you have to pay this. Do you want to do it?” Like, [groans] “Fine, I’ll eat some crow.”

But okay. And then followed immediately by, wow, if Pete of all people can mess this up, customers are going to be doomed here. We should figure out what happened. And I’m doing the math. Like, Pete, “What did you actually do?” And you’re sitting there and you’re saying, “Well, I had like a 20 gig volume that I did this.” And I’m doing the numbers, and it’s like—

Pete: Something’s wrong.

Corey: “How sure are you when you say ‘gigabyte,’ that you were—that actually means what you think it did? Like, were you off by a lot? Like, did you mean exabytes?” Like, what’s the deal here?

Pete: Like, multiple factors.

Corey: Yeah. How much—“How many IOPS did you give that thing, buddy?” And it turned out what happened was that when they launched this, they had mispriced it in the system by a factor of a million. So, it was fun. I think by the end of it, all of your experimentation was somewhere between five to seven cents. Which—

Pete: Yeah. It was a—

Corey: Which is why you don’t work here anymore because no one cost me seven cents of money to give to Amazon—

Pete: How dare you?

Corey: —on my watch. Get out.

Pete: How dare you, sir?

Corey: Exactly.

Pete: Yeah, that [laugh] was amazing to see, as someone who has done—definitely made screw-ups that have cost real money—you know, S3 list requests are always a fun one at scale—but that one was supremely fun to see the—

Corey: That was a scary one because another one they’d done previously was they had messed up Lightsail pricing, where people would log in, and, like, “Okay, so what is my Lightsail instance going to cost?” And I swear to you, this is true, it was saying—this was back in 2017 or so—the answer was, like, “$4.3 billion.” Because when you see that you just start laughing because you know it’s a mistake. You know, that they’re not going to actually demand that you spend $4.3 billion for a single instance—unless it’s running SAP—and great.

It’s just, it’s a laugh. It’s clearly a mispriced, and it’s clearly a bug that’s going to get—it’s going to get fixed. I just spun up this new EBS volume that no one fully understands yet and it cost me thousands of dollars. That’s the sort of thing that no, no, I could actually see that happening. There are instances now that cost something like 100 bucks an hour or whatnot to run. I can see spinning up the wrong thing by mistake and getting bitten by it. There’s a bunch of fun configuration mistakes you can make that will, “Hee, hee, hee. Why can I see that bill spike from orbit?” And that’s the scary thing.

Pete: Well, it’s the original CI and CD problem of the per-hour billing, right? That was super common of, like, yeah, like, an i3, you know, 16XL server is pretty cheap per hour, but if you’re charged per hour and you spin up a bunch for five minutes. Like, it—you will be shocked [laugh] by what you see there. So—
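The per-hour billing problem Pete describes can be sketched with a little arithmetic. This is a minimal illustration, not a statement of actual AWS prices: the hourly rate and the instance count are made up, and the per-second case ignores any minimum charge a provider might apply.

```python
import math

def billed_cost(hourly_rate, minutes_used, instances, per_hour_billing):
    """Cost of running `instances` machines for `minutes_used` minutes each."""
    if per_hour_billing:
        # Any partial hour is rounded up to a full billed hour.
        billed_hours = math.ceil(minutes_used / 60)
    else:
        # Finer-grained billing: pay only for the time actually used.
        billed_hours = minutes_used / 60
    return hourly_rate * billed_hours * instances

HOURLY_RATE = 5.00  # hypothetical price, not a real AWS rate
INSTANCES = 100     # hypothetical CI fleet size

per_hour = billed_cost(HOURLY_RATE, 5, INSTANCES, per_hour_billing=True)
per_second = billed_cost(HOURLY_RATE, 5, INSTANCES, per_hour_billing=False)
print(f"per-hour billing:   ${per_hour:,.2f}")
print(f"per-second billing: ${per_second:,.2f}")
```

Under these assumed numbers, a five-minute CI job is billed as if every machine ran for a full hour, which is the shock Pete is pointing at.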

Corey: Yeah. Mistakes will show. And I get it. It’s also people as individuals are very different psychologically than companies are. With companies it’s one of those, “Great we’re optimizing to bring in more revenue and we don’t really care about saving money at all costs.”

Whereas people generally have something that looks a lot like a fixed income in the form of a salary or whatnot, so it is easier for us to cut spend than it is for us to go out and make more money. Like, I don’t want to get a second job, or pitch my boss on stuff, and yeah. So, all in all, rounding out the rest of what happened at re:Invent, they—this is the problem is that they have a bunch of minor things like SageMaker Inference Recommender. Yeah, I don’t care. Anything—

Pete: [laugh].

Corey: —[crosstalk 00:28:47] SageMaker I mostly tend to ignore, for safety. I did like the way they described Amplify Studio because they made it sound like a WYSIWYG drag and drop, build a React app. It’s not it. It basically—you can do that in Figma and then it can hook it up to some things in some cases. It’s not what I want it to be, which is Honeycode, except good. But we’ll get there some year. Maybe.

Pete: There’s a lot of stuff that was—you know, it’s the classic, like, preview, which sure, like, from a product standpoint, it’s great. You know, they have a level of scale where they can say, “Here’s this thing we’re building,” which could be just a twinkle in a product manager’s eye, call it preview, and get thousands of people who would be happy to test it out and give you feedback, and it’s a, it’s great that you have that capability. But I often look at so much stuff and, like, that’s really cool, but, like, can I, can I have it now? Right? Like—or you can’t even get into the preview plan, even though, like, you have that specific problem. And it’s largely just because either, like, your scale isn’t big enough, or you don’t have a good enough relationship with your account manager, or I don’t know, countless other reasons.

Corey: The thing that really throws me, too, is the pre-announcements that come a year or so in advance, like, the Outpost smaller ones are finally available, but it feels like when they do too many pre-announcements or no big marquee service announcements, as much as they talk about, “We’re getting back to fundamentals,” no, you have a bunch of teams that blew the deadline. That’s really what it is; let’s not call it anything else. Another one that I think is causing trouble for folks—I’m fortunate in that I don’t do much work with Oracle databases, or Microsoft SQL databases—but they extended RDS Custom to Microsoft SQL at the [unintelligible 00:30:27] SQL server at re:Invent this year, which means this comes down to things I actually use, we’re going to have a problem because historically, the lesson has always been if I want to run my own databases and tweak everything, I do it on top of an EC2 instance. If I want a managed database, relational database service, great, I use RDS. RDS Custom basically gives you root into the RDS instance. Which means among other things, yes, you can now use RDS to run containers.

But it lets you do a lot of things that are right in between. So, how do you position this? When should I use RDS Custom? Can you give me an easy answer to that question? And they used a lot of words to say, no, they cannot. It’s basically completely blowing apart the messaging and positioning of both of those services in some unfortunate ways. We’ll learn as we go.

Pete: Yeah. Honestly, it’s like why, like, why would I use this? Or how would I use this? And this is I think, fundamentally, what’s hard when you just say yes to everything. It’s like, they in many cases, I don’t think, like, I don’t want to say they don’t understand why they’re doing this, but if it’s not like there’s a visionary who’s like, this fits into this multi-year roadmap.

That roadmap is largely—if that roadmap is largely generated by the customers asking for it, then it’s not like, oh, we’re building towards this Northstar of RDS being whatever. You might say that, but your roadmap’s probably getting moved all over the place because, you know, this company that pays you a billion dollars a year is saying, “I would give you $2 billion a year for all of my Oracle databases, but I need this specific thing.” I can’t imagine a scenario that they would say, “Oh, well, we’re building towards this Northstar, and that’s not on the way there.” Right? They’d be like, “New Northstar. Another billion dollars, please.”

Corey: Yep. Probably the worst release of re:Invent, from my perspective, is RUM, Real User Monitoring, for CloudWatch. And I, to be clear, I wrote a shitposting Twitter threading client called Last Tweet in AWS. Go to lasttweetinaws.com. You can all use it. It’s free; I just built this for my own purposes. And I’ve instrumented it with RUM. Now, Real User Monitoring is something that a lot of monitoring vendors use, and also CloudWatch now. And what that is, is it embeds a listener into the JavaScript that runs on client load, and it winds up looking at what’s going on loading times, et cetera, so you can see when users are unhappy. I have no problem with this. Other than that, you know, liking users? What’s up with that?

Pete: Crazy.

Corey: But then, okay, now, what this does is unlike every other RUM tool out there, which charges per session, meaning I am going to be… doing a web page load, it charges per data item, which includes HTTP errors, or JavaScript errors, et cetera. Which means that if you have a high transaction volume site and suddenly your CDN takes a nap like Fastly did for an hour last year, suddenly your bill is stratospheric for this because errors abound and cascade, and you can have thousands of errors on a single page load for these things, and it is going to be visible from orbit, at least with a per session basis thing, when you start to go viral, you understand that, “Okay, this is probably going to cost me some more on these things, and oops, I guess I should write less compelling content.” Fine. This is one of those one misconfiguration away and you are wailing and gnashing teeth. Now, this is a new service. I believe that they will waive these surprise bills in the event that things like that happen. But it’s going to take a while and you’re going to be worrying the whole time if you’ve rolled this out naively. So it’s—

Pete: Well and—

Corey: —I just don’t like the pricing.

Pete: —how many people will actively avoid that service, right? And honestly, choose a competitor because the competitor could be—the competitor could be five times more expensive, right, on face value, but it’s the certainty of it. It’s the uncertainty of what Amazon will charge you. Like, no one wants a surprise bill. “Well, a vendor is saying that they’ll give us this contract for $10,000. I’m going to pay $10,000, even though RUM might be a fraction of that price.”

It’s honestly, a lot of these, like, product analytics tools and monitoring tools, you’ll often see they price be a, like, you know, MAU, Monthly Active User, you know, or some sort of user-based pricing, like, the number of people coming to your site. You know, and I feel like at least then, if you are trying to optimize for lots of users on your site, and more users means more revenue, then you know, if your spend is going up, but your revenue is also going up, that’s a win-win. But if it’s like someone—you know, your third-party vendor dies and you’re spewing out errors, or someone, you know, upgraded something and it spews out errors. That no one would normally see; that’s the thing. Like, unless you’re popping open that JavaScript console, you’re not seeing any of those errors, yet somehow it’s like directly impacting your bottom line? Like that doesn’t feel [crosstalk 00:35:06].
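The difference between per-session and per-event pricing during an error storm can be sketched as follows. All prices, session counts, and event counts here are hypothetical and chosen only to show the shape of the risk; they are not CloudWatch RUM's or any vendor's actual rates.

```python
def per_session_cost(sessions, price_per_session):
    """Bill scales only with how many visits the site gets."""
    return sessions * price_per_session

def per_event_cost(sessions, events_per_session, price_per_event):
    """Bill scales with visits AND with how noisy each page load is."""
    return sessions * events_per_session * price_per_event

# Hypothetical figures for illustration only.
SESSIONS = 100_000
PRICE_PER_SESSION = 0.001
PRICE_PER_EVENT = 0.00001

flat = per_session_cost(SESSIONS, PRICE_PER_SESSION)
normal = per_event_cost(SESSIONS, 20, PRICE_PER_EVENT)     # quiet page loads
outage = per_event_cost(SESSIONS, 2_000, PRICE_PER_EVENT)  # errors cascade

print(f"per-session, any day:    ${flat:,.2f}")
print(f"per-event, normal day:   ${normal:,.2f}")
print(f"per-event, outage day:   ${outage:,.2f}")
```

With the same traffic, the per-session bill does not move, while the per-event bill grows in proportion to the errors per page load. That is the predictability argument Pete makes for paying a competitor more.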

Corey: Well, there is something vaguely Machiavellian about that. Like, “How do I get my developers to care about errors on consoles?” Like, how about we make it extortionately expensive for them not to. It’s, “Oh, all right, then. Here we go.”

Pete: And then talk about now you’re in a scenario where you’re working on things that don’t directly impact the product. You’re basically just sweeping up the floor and then trying to remove errors that maybe don’t actually affect it and they’re not actually an error.

Corey: Yeah. I really do wonder what the right answer is going to be. We’ll find out. Again, we live, we learn. But it’s also, how long does it take a service that has bad pricing at launch, or an unfortunate story around it to outrun that reputation?

People are still scared of Glacier because of its original restore pricing, which was non-deterministic for any sensible human being, and in some cases led to, “I’m used to spending 20 to 30 bucks a month on this. Why was I just charged two grand?”

Pete: Right.

Corey: Scare people like that, they don’t come back.

Pete: I’m trying to actually remember which service it is that basically gave you an estimate, right? Like, turn it on for a month, and it would give you an estimate of how much this was going to cost you when billing started.

Corey: It was either Detective or GuardDuty.

Pete: Yeah, it was—yeah, that’s exactly right. It was one of those two. And honestly, that was unbelievably refreshing to see. You know, like, listen, you have the data, Amazon. You know what this is going to cost me, so when I, like, don’t make me spend all this time to go and figure out the cost. If you have all this data already, just tell me, right?

And if I look at it and go, “Yeah, wow. Like, turning this on in my environment is going to cost me X dollars. Like, yeah, that’s a trade-off I want to make, I’ll spend that.” But you know, with some of the—and that—a little bit of a worry on some of the intelligent tiering on S3 is that the recommendation is likely going to be everything goes to intelligent tiering first, right? It’s the gp3 story. Put everything on gp3, then move it to the proper volume, move it to an sc or an st or an io. Like, gp3 is where you start. And I wonder if that’s going to be [crosstalk 00:37:08].

Corey: Except I went through a wizard yesterday to launch an EC2 instance and its default on the free tier was gp2.

Pete: Yeah. Interesting.

Corey: Which does not thrill me. I also still don’t understand for the life of me why in some regions, the free tier is a t2 instance, when t3 is available.

Pete: They’re uh… my guess is that they’ve got some free t—they got a bunch of t2s lying around. [laugh].

Corey: Well, one of the most notable announcements at re:Invent that most people didn’t pay attention to is their ability now to run legacy instance types on top of Nitro, which really speaks to what’s going on behind the scenes of we can get rid of all that old hardware and emulate the old m1 on modern equipment. So, because—you can still have that legacy, ancient instance, but now you’re going—now we’re able to wind up greening our data centers, which is part of their big sustainability push, with their ‘Sustainability Pillar’ for the well-architected framework. They’re talking more about what the green choices in cloud are. Which is super handy, not just because of the economic impact because we could use this pretty directly to reverse engineer their various margins on a per-service or per-offering basis. Which I’m not sure they’re aware of yet, but oh, they’re going to be.

And that really winds up being a win for the planet, obviously, but also something that is—that I guess puts a little bit of choice on customers. The challenge I’ve got is, with my serverless stuff that I build out, if I spend—the Google search I make to figure out what the most economic, most sustainable way to do that is, is going to have a bigger carbon impact on the app itself. That seems to be something that is important at scale, but if you’re not at scale, it’s one of those, don’t worry about it. Because let’s face it, the cloud providers—all of them—are going to have a better sustainability story than you are running this in your own data centers, or on a Raspberry Pi that’s always plugged into the wall.

Pete: Yeah, I mean, you got to remember, Amazon builds their own power plants to power their data centers. Like, that’s the level they play, right? There, their economies of scale are so entirely—they’re so entirely different than anything that you could possibly even imagine. So, it’s something that, like, I’m sure people will want to choose for. But, you know, if I would honestly say, like, if we really cared about our computing costs and the carbon footprint of it, I would love to actually know the carbon footprint of all of the JavaScript trackers that when I go to various news sites, and it loads, you know, the whatever thousands of trackers and tracking the all over, like, what is the carbon impact of some of those choices that I actually could control, like, as either a consumer or business person?

Corey: I really hope that it turns into something that makes a meaningful difference, and it’s not just greenwashing. But we’ll see. In the fullness of time, we’re going to figure that out. Oh, they’re also launching some mainframe stuff. They—like that’s great.

Pete: Yeah, those are still a thing.

Corey: I don’t deal with a lot of customers that are doing things with that in any meaningful sense. There is no AWS/400, so all right.

Pete: [laugh]. Yeah, I think honestly, like, I did talk to a friend of mine who’s in a big old enterprise and has a mainframe, and they’re actually replacing their mainframe with Lambda. Like they’re peeling off—which is, like, a great move—taking the monolith, right, and peeling off the individual components of what it can do into these discrete Lambda functions. Which I thought was really fascinating. Again, it’s a five-year-long journey to do something like that. And not everyone wants to wait five years, especially if their support’s about to run out for that giant box in the, you know, giant warehouse.

Corey: The thing that I also noticed—and this is probably the—I guess, one of the—talk about swing and a miss on pricing—they have a—what is it?—there’s a VPC IP Address Manager, which tracks the IP addresses assigned to your VPCs that are allocated versus not, and it’s 20 cents a month per IP address. It’s like, “Okay. So, you’re competing against a Google Sheet or an Excel spreadsheet”—which is what people are using for these things now—“Only you’re making it extortionately expensive?”

Pete: What kind of value does that provide for 20—I mean, like, again—

Corey: I think Infoblox or someone like that offers it where they become more cost-effective as soon as you hit 500 IP addresses. And it’s just—like, this is what I’m talking about. I know it does not cost AWS that kind of money to store an IP address. You can store that in a Route 53 TXT record for less money, for God’s sake. And that’s one of those, like, “Ah, we could extract some value pricing here.”

Like, I don’t know if it’s a good product or not. Given its pricing, I don’t give a shit because it’s going to be too expensive for anything beyond trivial usage. So, it’s a swing and a miss from that perspective. It’s just, looking at that, I laugh, and I don’t look at it again.
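To put the "20 cents a month per IP address" figure from the conversation in perspective, here is a rough sketch of how that per-address price scales. The rate is the one quoted above; the address counts are arbitrary examples, and the sketch assumes every tracked address is billed at that flat rate.

```python
CENTS_PER_IP_PER_MONTH = 20  # figure quoted in the conversation

def monthly_cost_dollars(ip_count):
    """Monthly cost in dollars, assuming a flat per-address rate."""
    return ip_count * CENTS_PER_IP_PER_MONTH / 100

for ip_count in (50, 500, 5_000, 50_000):  # arbitrary example sizes
    cost = monthly_cost_dollars(ip_count)
    print(f"{ip_count:>6} addresses: ${cost:>10,.2f}/month  (${cost * 12:,.2f}/year)")
```

At the 500-address mark Corey mentions, the flat rate already works out to $100 a month, which is the point where he suggests a dedicated vendor becomes the cheaper option.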

Pete: See I feel—

Corey: I’m not usually price sensitive. I want to be clear on that. It’s just, that is just Looney Tunes, clown shoes pricing.

Pete: Yeah. It’s honestly, like, in many cases, I think the thing that I have seen, you know, in the past few years is, in many cases, it can honestly feel like Amazon is nickel-and-diming their customers in so many ways. You know, the explosion of making it easy to create multiple Amazon accounts has a direct impact to waste in the cloud because there’s a lot of stuff you have to have per account. And the more accounts you have, those costs grow exponentially as you have these different places. Like, you kind of lose out on the economies of scale when you have a smaller number of accounts.

And yeah, it’s hard to optimize for that. Like, if you’re trying to reduce your spend, it’s challenging to say, “Well, by making a change here, we’ll save, you know, $10,000 in this account.” “That doesn’t seem like a lot when we’re spending millions.” “Well, hold on a second. You’ll save $10,000 per account, and you have 500 accounts,” or, “You have 1000 accounts,” or something like that.

Or almost cost avoidance of this cost is growing unbounded in all of your accounts. It’s tiny right now. So, like, now would be the time you want to do something with it. But like, again, for a lot of companies that have adopted the practice of endless Amazon accounts, they’ve almost gone, like, it’s the classic, like, you know, I’ve got 8000 GitHub repositories for my source code. Like, that feels just as bad as having one GitHub repository for your repo. I don’t know what the balance is there, but anytime these different types of services come out, it feels like, “Oh, wow. Like, I’m going to get nickeled and dimed for it.”
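Pete's per-account point is simple multiplication, sketched here with the figures from the conversation ($10,000 per account, 500 or 1,000 accounts). The function name is just for illustration.

```python
def fleet_savings(savings_per_account, account_count):
    """Total savings when the same fix applies to every account."""
    return savings_per_account * account_count

for accounts in (1, 500, 1_000):
    total = fleet_savings(10_000, accounts)
    print(f"{accounts:>5} accounts: ${total:,}")
```

A change that looks negligible in one account becomes a seven-figure number once it is multiplied across the whole organization, and the same multiplication applies to small per-account costs that are left to grow.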

Corey: This ties into the re:Post launch, which is a rebranding of their forums, where, okay, great, it was a little crufty and it needed modernizing, but it still ties your identity to an IAM account, or the root email address for an Amazon account, which is great. This is completely worthless because as soon as I change jobs, I lose my identity, my history, the rest, on this forum. I’m not using it. It shows that there’s a lack of awareness that everyone is going to have multiple accounts with which they interact, and that people are going to deal with the platform longer than any individual account will. It’s just a continual swing and a miss on things like that.

And it gets back to the billing question of, “Okay. When I spin up an account, do I want them to just continue billing me—because don’t turn this off; this is important—or do I want there to be a hard boundary where if you’re about to charge me, turn it off. Turn off the thing that’s about to cost me money.” And people hem and haw like this is an insurmountable problem, but I think the way to solve it is, let me specify that intent when I provision the account. Where it’s, “This is a production account for a bank. I really don’t want you turning it off.” Versus, “I’m a student learner who thinks that a Managed NAT Gateway might be a good thing. Yeah, I want you to turn off my demo Hello World app that will teach me what’s going on, rather than surprising me with a five-figure bill at the end of the month.”
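The "specify that intent when I provision the account" idea could look something like the sketch below. Everything here is hypothetical: the names `BillingIntent`, `Account`, and `action_for` are invented for illustration and do not correspond to any existing AWS feature or API.

```python
from dataclasses import dataclass
from enum import Enum

class BillingIntent(Enum):
    KEEP_RUNNING = "keep_running"  # e.g. production: never stop on cost
    HARD_CAP = "hard_cap"          # e.g. learner: stop rather than exceed the cap

@dataclass
class Account:
    name: str
    intent: BillingIntent
    monthly_cap: float = 0.0  # only meaningful for HARD_CAP accounts

def action_for(account, projected_charge):
    """Decide what to do when a charge is about to land on the account."""
    if account.intent is BillingIntent.HARD_CAP and projected_charge > account.monthly_cap:
        return "stop_resources"
    return "continue_billing"

bank = Account("bank-production", BillingIntent.KEEP_RUNNING)
student = Account("student-sandbox", BillingIntent.HARD_CAP, monthly_cap=0.0)

print(bank.name, "->", action_for(bank, projected_charge=50_000.0))
print(student.name, "->", action_for(student, projected_charge=45.0))
```

The point of the sketch is only that the decision is made once, up front, by the account owner, so the platform never has to guess whether turning something off is safe.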

Pete: Yeah. It shouldn’t be that hard. I mean, but again, I guess everything’s hard at scale.

Corey: Oh, yeah. Oh yeah.

Pete: But still, I feel like every time I log into Cost Explorer and I look at—and this is years it’s still not fixed. Not that it’s even possible to fix—but on the first day of the month, you look at Cost Explorer, and look at what Amazon is estimating your monthly bill is going to be. It’s like because of your, you know—

Corey: Your support fees, and your RI purchases, and savings plans purchases.

Pete: [laugh]. All those things happened, right? First of the month, and it’s like, yeah, “Your bill’s going to be $800,000 this year.” And it’s like, “Shouldn’t be, like, $1,000?” Like, you know, it’s the little things like that, that always—

Corey: The one-off charges, like, “Oh, your Route 53 zone,” and all the stuff that gets charged on a monthly cadence, which fine, whatever. I mean, I’m okay with it, but it’s also the, like, be careful when that happen—I feel like there’s a way to make that user experience less jarring.

Pete: Yeah because that problem—I mean, in my scenario, companies that I’ve worked at, there’s been multiple times that a non-technical person will look at that data and go into immediate freakout mode, right? And that’s never something that you want to have happen because now that’s just adding a lot of stress and anxiety into a company that is—with inaccurate data. Like, the data—like, the answer you’re giving someone is just wrong. Perhaps you shouldn’t even give it to them if it’s that wrong. [laugh].
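One way to see why a first-of-the-month estimate can look alarming is to assume the forecast is a naive run-rate extrapolation. That is an assumption made purely for illustration, not a description of how Cost Explorer actually forecasts, and the dollar amounts below are invented.

```python
def naive_forecast(spend_so_far, days_elapsed, days_in_month):
    """Extrapolate everything spent so far as if it recurred daily."""
    return spend_so_far / days_elapsed * days_in_month

def adjusted_forecast(recurring_spend, one_time_charges, days_elapsed, days_in_month):
    """Extrapolate only the recurring part; count one-time charges once."""
    return one_time_charges + recurring_spend / days_elapsed * days_in_month

# Hypothetical day-one bill: upfront purchases and monthly fees land at once.
ONE_TIME = 25_000.0  # e.g. support fees, RI and savings plan purchases
DAILY = 1_000.0      # ordinary usage for the first day

naive = naive_forecast(ONE_TIME + DAILY, days_elapsed=1, days_in_month=30)
adjusted = adjusted_forecast(DAILY, ONE_TIME, days_elapsed=1, days_in_month=30)
print(f"naive forecast:    ${naive:,.2f}")
print(f"adjusted forecast: ${adjusted:,.2f}")
```

Under that assumption, treating the day-one lump sum as a daily rate inflates the estimate many times over, which is the kind of number that sends a non-technical reader into the freakout mode Pete describes.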

Corey: Yeah, I’m looking forward to seeing what happens this coming year. We’re already seeing promising stuff. They—give people a timeline on how long in advance these things record—late last night, AWS released a new console experience. When you log into the AWS console now, there’s a new beta thing. And I gave it some grief on Twitter because I’m still me, but I like the direction it’s going. It lets you customize your view with widgets and whatnot.

And until they start selling widgets on marketplace or having sponsored widgets you can’t remove, I like it, which is no guarantee at some point. But it shows things like, I can move the cost stuff, I can move the outage stuff up around, I can have the things that are going on in my account—but who I am means I can shift this around. If I’m a finance manager, cool. I can remove all the stuff that’s like, “Hey, you want to get started spinning up an EC2 instance?” “Absolutely not. Do I want to get told, like, how to get certified? Probably not. Do I want to know what the current bill is and whether—and my list of favorites that I’ve pinned, whatever services there? Yeah, absolutely do.” This is starting to get there.

Pete: Yeah, I wonder if it really is a way to start almost hedging on organizations having a wider group of people accessing AWS. I mean, in previous companies, I absolutely gave access to the console for tools like QuickSight, for tools like Athena, for the DataBrew stuff, the Glue DataBrew. Giving, you know, non-technical people access to be able to do these, like, you know, UI ETL tasks, you know, a wider group of a company is getting access into Amazon. So, I think anything that Amazon does to improve that experience for, you know, the non-SREs, like the people who would traditionally log in, like, that is an investment definitely worth making.

Corey: “Well, what could non-engineering types possibly be doing in the AWS console?” “I don’t know, jackhole, maybe paying the bill? Just a thought here.” It’s the, there are people who look at these things from a variety of different places, and you have such sprawl in the AWS world that there are different personas by a landslide. If I’m building Twitter for Pets, you probably don’t want to be pitching your mainframe migration services to me the same way that you would if I were a 200-year-old insurance company.

Pete: Yeah, exactly. And the number of those products are going to grow, the number of personas are going to grow, and, yeah, they’ll have to do something if they want to actually, you know, maintain that experience so that every person can have, kind of, the experience that they want, and not be distracted, you know? “Oh, what’s this? Let me go test this out.” And it’s like, you know, one-time charge for $10,000 because, like, that’s how it’s charged. You know, that’s not an experience that people like.

Corey: No. They really don’t. Pete, I want to thank you for spending the time to chat with me again, as is our tradition. I’m hoping we can do it in person this year, when we go at the end of 2022, to re:Invent again. Or that no one goes in person. But this hybrid nonsense is for the birds.

Pete: Yeah. I very much would love to get back to another one, and yeah, like, I think there could be an interesting kind of merging here of our annual re:Invent recap slash live brunch, you know, stream you know, hot takes after a long week. [laugh].

Corey: Oh, yeah. The real way that you know that it’s a good joke is when one of us says something, the other one sprays scrambled eggs out of their nose. Yeah, that’s the way to do it.

Pete: Exactly. Exactly.

Corey: Pete, thank you so much. If people want to learn more about what you’re up to—hopefully, you know, come back. We miss you, but you’re unaffiliated, you’re a startup advisor. Where can people find you to learn more, if they for some unforgivable reason don’t know who or what a Pete Cheslock is?

Pete: Yeah. I think the easiest place to find me is always on Twitter. I’m just at @petecheslock. My DMs are always open and I’m always down to expand my network and chat with folks.

And yeah, right now, I’m just, as I jokingly say, professionally unaffiliated. I do some startup advisory work and have been largely just kind of—honestly checking out the state of the economy. Like, there’s a lot of really interesting companies out there, and some interesting problems to solve. And, you know, trying to spend some of my time learning more about what companies are up to nowadays. So yeah, if you got some interesting problems, you know, you can follow my Twitter or go to LinkedIn if you want some great, you know, business hot takes about, you know, shitposting basically.

Corey: Same thing. Pete, thanks so much for joining me, I appreciate it.

Pete: Thanks for having me.

Corey: Pete Cheslock, startup advisor, professionally unaffiliated, and recurring re:Invent analyst pal of mine. I’m Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you’ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you’ve hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment calling me a jackass because do I know how long it took you personally to price CloudWatch RUM?

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.


2021 Duckbill Group, LLC