Exploring Advanced Cybersecurity with Michael Isbitski
SITC-536-Michael Isbitski
===
[00:00:00] Michael: Sometimes the language I’ll use that’s kind of become common language, certainly with security practitioners, it’s kind of like security through obscurity.
[00:00:10] Corey: Welcome to Screaming in the Cloud. I’m Corey Quinn. This promoted guest episode is brought to us by our friends at Sysdig. And Mike Isbitski is our guest today. He’s the director of cybersecurity strategy. Welcome back, Mike. How have you been?
[00:00:26] Michael: I’ve been great, Corey. It’s been a year. [laugh] A lot of things have happened. But yes, it’s really great to be here and talk with you again [laugh] .
[00:00:33] Corey: It’s the worst—the days are short, but the years are long, and it’s always this really weird approach to, just, the passing of time. It’s 2024 now, and I’ve talked about things like, oh yeah, five years ago—like, oh, that wasn’t five years ago. It was 2019. Oh, no.
[00:00:47] Michael: Time moves fast, especially with little ones.
[00:00:49] Corey: Dear Lord. Teaching them to talk was the greatest mistake. So, it is 2024, and that can mean only one thing: the [“Sysdig 2024 Cloud-Native Security and Usage Report” 00:01:27] . So much to tear into; let’s start at the beginning with titles. Cloud-native, what does it mean?
[00:01:06] Michael: Oh, that—it’s like the proverbial question, right, and without getting too philosophical, I mean, where I usually start with this—you know, even from my analyst days—it was container-focused, right? Or container—containers are kind of at the heart of the thing. The name is almost misleading, right? It doesn’t necessarily need to be cloud-hosted, but the technologies that power cloud are part of the equation, so typically, that’s containerized services.
[00:01:30] Corey: But then you get into the whole serverless approach, event-driven architectures. Those count as well.
[00:01:34] Michael: Yep.
[00:01:34] Corey: Increasingly—and I’m cynical here; I’m not targeting you folks directly, but I’m not exempting you either because I think everyone does it now—increasingly, cloud-native is used to mean ‘things I can sell you.’ It’s more of a branding and positioning thing, whereas, like, oh, we have an appliance that we’ve turned into an AMI. It’s cloud-native. It’s like, “Mmm, is it, though?”
[00:01:56] Michael: Yeah.
[00:01:57] Corey: It’s a bone that I’m picking, but it’s also a war that I’d lost, similar to the term FinOps. I think that’s a bad term, but everyone’s using it at this point, and I’m not going to be, like, the last person raging against the tide, trying to defend my sandcastle.
[00:02:11] Michael: Yeah, it’s—I mean, I was seeing some of that certainly from vendors. Like, they would sell a cloud service that wasn’t actually fully cloud-enabled. They’d kind of give you a virtual machine that could operate in the cloud, so is it taking full advantage of those, you know, native—to use the term we’re trying to define—things that power the clouds that Amazon and Microsoft and Google are running? That is essentially it, right? So, when you dig in a few layers, it’s absolutely containers, Kubernetes, serverless, event-driven architectures—
[00:02:43] Corey: Turtles all the way down.
[00:02:45] Michael: Oh, yeah. Lots of complexity, right? Lots of things to break.
[00:02:49] Corey: So, at some level, this is the 2024 report, and you can compare that to the 2023 report. And you know, at some level, it’s okay, I think that the sequel might not have done as much service as I was hoping, there’s not a whole lot of new plot development, I find the protagonist still relatively unrelatable, but what have you noticed that are big shifts that have happened in a year? We talk about days being short, years being long. What’s changed in 12 months?
[00:03:14] Michael: I’d say some of our customers, actually a good percentage of them, got better at vulnerability management. Last year, we talked a lot about the concept of runtime insight and providing runtime context to make better risk-informed decisions, right? Ideally, you’re doing all kinds of testing early and often in the lifecycle of an application or the workloads that support it. You might be finding instances of vulnerabilities that should be fixed prior to deploying to production, and then you fix that thing. But you know, typically, what organizations run into is, as they start scanning, they find more and more issues, right? Commonly, that’s a [CVE ID] , right, and then they start looking at CVSS scoring, and what’s the criticality of that particular vulnerability, but you still have a mountain of things that you just don’t know what to fix, right? So—
[00:04:01] Corey: It’s the Dependabot problem. I’ve been ranting about this for a while now, where I wind up—it feels like it’s the human Nessus scan at this point, only not human in that particular case. Because, oh, here’s a whole bunch of critical problems you need to fix. And is it, though? It’s in a library that my application depends upon. That application is purely internal, will only ever see sanitized inputs, and has a defined output. I don’t actually care about this, but that noise obscures things that actually do pose some risk—theoretically—to a bunch of things.
[00:04:35] I am, to be direct, a little less security concerned with the stuff that I’m talking about because it tends to be stuff that works on the back-end to help my newsletter get written. The worst-case compromise story here, it’s a completely distinct system from anything that has recipient email addresses, so all someone’s going to be able to do is breach the things I’m going to be writing about in four more days, or, worst case, insert garbage into my queue, which would be annoying for me, but has no real impact. It’s juxtaposing risk versus consequences on it. I could spend weeks of my time and significantly complicate my entire flow and structure to lock a lot of those things down more than they are, but there’s no real reason to. It winds up impeding usability, and there’s no business value to me in doing that.
[00:05:22] Whereas when I say—I’ve said this about things like the customer data that I’m entrusted with for my consulting projects—if I were to say that about it, I would very quickly not have customers anymore. We have built that in some of the best ways that we know how, and then—and I think people miss this step—we brought in security experts to look at it, so we’re not in a situation of violating Schneier’s Law. Bruce Schneier is a big fan of saying that anyone can devise a cryptographic algorithm that they themselves cannot break. Security is kind of the same way, so bring in people who are sneakier than I am and do this stuff a lot. I think I’m checking the boxes and doing what I should be doing, but you’re the expert. Tell me if I’m wrong. There’s value in that.
[00:06:04] Michael: Yeah. And I’m glad you brought up the point, Corey. So, it’s absolutely true, right? It’s like you have to start thinking about the business context. Sure, that vulnerability might be critical as rated by MITRE, but is that mitigated with some other control in my enterprise, right? So, businesses really are looking at the complete picture. They can’t fix everything.
[00:06:24] And you kind of hit the nail on the head. Not everything’s in use, right, so why even chase those vulnerabilities? So, that’s the reality of modern vulnerability management: you’re not going to address everything. But a lot of what’s also changed—kind of, stepping outside of just the Sysdig report—is kind of the regulatory landscape. So, we saw the finalization of the SEC cybersecurity disclosure requirements, and it very much talks about some of the things you were alluding to.
[00:06:49] It’s, you know, well, what’s the—to use their language—material impact, right, which is going to be your business impact. So, if you’re a publicly traded entity or doing business with publicly-traded entities, well, you might have an incident that results from that, and it’s not, kind of, in the terms that a security practitioner might think of, right? Like, they often just kind of jump right to data breach. It’s, is there some kind of business impact to you, right? So, you talked about customer names—loss of that kind of intellectual property, right—but it’s probably also a governed data type.
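To make the runtime-context triage Michael describes concrete, here is a minimal sketch—hypothetical names, thresholds, and data throughout—of taking a mountain of CVE findings and keeping only the ones that are both severe and in a package actually loaded at runtime:

```python
# Hypothetical sketch: prioritize CVE findings by runtime context.
from dataclasses import dataclass

@dataclass
class Finding:
    cve_id: str
    package: str
    cvss: float  # CVSS base score, 0.0-10.0

def triage(findings: list[Finding], in_use: set[str]) -> list[Finding]:
    """Keep only severe findings in packages observed loaded at runtime."""
    actionable = [f for f in findings if f.package in in_use and f.cvss >= 7.0]
    # Highest severity first, so the mountain becomes a short, ordered list.
    return sorted(actionable, key=lambda f: f.cvss, reverse=True)

findings = [
    Finding("CVE-2024-00001", "libfoo", 9.8),   # critical, but never loaded
    Finding("CVE-2024-00002", "openssl", 8.1),  # high, and actually in use
]
print(triage(findings, in_use={"openssl"}))  # only the openssl finding survives
```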
[00:07:20] Corey: Well, and even now, let’s be serious for a second here, right? My business fixes large AWS bills for companies. The data that I have involves how much money big companies pay a giant company, and if that gets leaked, I’ll get sued into oblivion, I will have lost the trust of my customers, and that’s important, but nobody dies. There is no—like, people view these things as confidential information, sure, but it’s not, in almost any case, trade-secret-level information that’s going to materially impact their ability to do business. It might cause a scandalous news cycle, but there’s not going to be a whole lot of real downstream impact from that.
[00:07:56] Contrast that to the scariest job I ever had, doing operation stuff briefly at Grindr, where if we got breached, people could die. Everything compared to that, this is just money, and corporate money at that. I sleep very well at night as a result. I intentionally don’t want corporate secrets of massive import. I’m not running a password manager here; I’m fixing large company cloud bills. One of those has a very different risk profile from the other.
[00:08:22] Michael: Yeah, it hits on a very good point, right? You kind of talked about safety, you know, and within the umbrella of cybersecurity regulation, there’s the concept of resiliency, but that information would be materially impacting, right? If you have financial numbers about whatever publicly traded entity, it could impact the decisions of a reasonably informed investor, right? And that’s kind of the SEC’s charter. They need to protect them.
[00:08:50] Corey: Yes and no. It’s a fair point. The counterargument is if you take some random publicly-traded company that’s worth, I don’t know, a market cap of 20 billion, let’s say, for the sake of argument, and you find that, in a year, they’re spending $20 million on AWS versus $120 million on AWS, that is a six-fold difference. In either scenario, though, it’s just a data point in isolation. Does that tell you anything about their business? Given that I made them up, it’s, well, either one could be a problem, depending, or it could be great, but just with that data point, there’s nowhere near enough information to figure that out. It would be a piece of a puzzle; it’s not the answer to anything.
[00:09:29] Michael: Yeah, and that’s kind of where—how do you gauge materiality, which is probably a conversation for a separate podcast [laugh] , but it’s absolutely true, right? And it shifts, right? It becomes… a dynamic thing in context of other events, right? So like, those numbers in isolation, that’s fine, you might not be able to correlate that to some other activity, but then if you now map that to these companies’ cybersecurity expenditure and make kind of assessments about their security posture, well, that could kind of lead into that material impact that would need to be disclosed. So yeah, it kind of becomes quite a bit of a quagmire. And going back to, kind of, vulnerabilities that would lead to that—it’s kind of that sequence of things that could lead up, right? If, you know, you have known vulnerabilities, you should be addressing them, certainly if they’re in use at runtime.
[00:10:25] Corey: Yeah. One thing that your report does not touch on—or at least, if it did, I did not see it when I went through it—and this has been the case forever, but a long time ago, the internet collectively decided, for a variety of reasons and pressures and constraints, that the cornerstone of your entire online identity is your email inbox. So, people talk about compromising credentials out of environments and the rest, and about exploiting software vulnerabilities. There’s a story of, if I can get access to your email, I can become you, and that is something that most people like to hand-wave away. So, when I talk to companies that are adamant about enabling multifactor auth whenever they’re getting into their cloud account, that’s great.
[00:11:06] And then you dig, and it turns out they don’t do that for their email. It’s, what are you doing there? It’s like, everyone’s trying to massively remove single points of failure from their environment in the cloud, but not from the credit card that’s the payment instrument, which goes to an email address that the founder never checks anymore. So, there’s always a question of, have you thought through these things to their logical conclusions? And that is something that I don’t [unintelligible] necessarily fix. It’s certainly out of scope for what Sysdig does, but the stuff that’s in scope, the things that you turned up in this report—a couple things I found surprising.
[00:11:41] Michael: No, I was just going to kind of validate, right, kind of the landscape of all your identities and where you assign access controls. But we get into, like, consumption patterns and the things that Sysdig might see. But yeah, go ahead. There was a second part to your question.
[00:11:55] Corey: Yeah. The way that I think about these things comes from my background when I was tinkering around with these things, many moons ago, when we talked about servers as having distinct identities, and virtualization was treated with suspicion. Containers—I would say they didn’t exist, but containers are from the ’70s: LPARs on mainframes, and they’ve been iterating steadily ever since. But containers were not a thing in the same way. Even when I talk about email, for example, I talk about long-term, sustained, persistent attacks. The world doesn’t always work that way. A lot of jobs are very short-lived, ephemeral containers, and one of the key findings that you have in your report is that ephemeral, short-lived containers do not help from a security perspective, or if they do, not nearly as much as people believe they do.
[00:12:41] Michael: Yeah, it’s an interesting one. I sometimes gloss over this, probably because I’ve just been looking at the technology too long. It’s kind of like you expect [laugh] that—it’s not necessarily, like, a security boundary. Sometimes the language I’ll use that’s kind of become common language, certainly with security practitioners, is security through obscurity, right? So, you’re abstracting the operating system from the workload, much like you’re abstracting the hardware from the operating system when you’re talking about virtualization.
[00:13:08] So, you’re kind of [sigh] adding complexity, really, but you’re kind of inhibiting visibility. It’s another layer of abstraction when you start talking about containers. So, this is actually a big one with visibility, right, which becomes part of, well, how do I actually determine if an incident is relevant, right? Did it create security risk for my organization, and does it have material impact? But it’s often where organizations aren’t prepared, right? Even if they have the engineering staff looking at containers, they’re not thinking about, you know, that full sequence of events.
[00:13:41] Like, if somebody does attack a containerized service, am I getting all the right telemetry? And you raised the point about lifetime of the container, the ephemerality. It’s only alive for five minutes. Did I actually retain all those signals that now I can kind of stitch together the events, and quickly enough, right, because there’s also the whole concept of real time, right, or as close to that as you can get in cloud. And we could certainly go down that path, but, you know, the regulations are very clear on this.
[00:14:13] Like, you need to kind of be in that spot that you can detect incidents in real time, so you can very quickly make that determination around materiality, and then do I have to disclose this to whatever governing body, right? We talked about the SEC, but you know, it might be in the EU, and you’re bound by the NIS2 Directive, so that needs to be disclosed to those appropriate entities. So, gathering the right telemetry throughout the tech stack becomes incredibly important. Certainly Sysdig customers are prepared for that because like, that’s kind of what we call out, right? They’re very mature in their threat detection and response. They often seek out Sysdig for that reason, right? I need to have that visibility within the containerized architecture, so I can do all these things that inform my risk management and vulnerability management.
[00:14:58] Corey: Well, hang on a second. I’m going to challenge you on that because historically, the problem you have with a lot of those telemetry solutions—and again, I’ve not gone deep into the weeds at a scaled-out system with Sysdig; I don’t run scaled-out systems anymore. Everything I do tends to be very small. It’s like, is it really a big data problem if it fits in RAM on a single machine? It’s that type of problem here.
[00:15:17] But there’s always been a challenge where even with some of the small-scale experiments I’m doing, it’s, wow, those logs never shut up for a second, kind of like me on Twitter. So, you wind up with a needle-in-a-haystack problem. And in some cases, I’ve talked to people—who of course, I will not name—who have taken a position privately that, frankly, if you record a lot of this data, you’re never going to be able to go through it, so all you’re really doing is providing the smoking-gun evidence later, so you can get yelled at: see, if you’d been looking through this, you would have seen the signs of the breach. So, I don’t keep it, so I don’t get fired. I don’t think that that’s a terrific approach to take, but it is something that I see people talking about over drinks.
[00:15:59] Michael: Yeah, it’s absolutely true, right, and I’d say every cloud provider will say publicly that they give you the telemetry, although we’ve seen very specific, kind of, language from governing bodies and nation-states that cloud providers might need to do more and offer more, right? You shouldn’t have to pay more for detailed logs. But you absolutely hit the nail on the head, right? Just the presence of that log data isn’t enough; you also need to be doing things with it. I think you spoke with my colleague Anna Belak a few weeks ago on the 5/5/5 benchmark, and there’s a very specific component to that, which is about correlating these signals, right? So yeah, you have to pick up that needle in the haystack—or the sets of needles, right—and correlate them very quickly, so now you can actually determine what the appropriate response is for that incident.
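As a toy illustration of that correlation step—not Sysdig’s implementation, just a sketch with assumed event shapes—here is grouping signals per workload and only surfacing clusters that land inside a five-minute window:

```python
# Sketch: correlate per-workload signals inside a short window (5/5/5-style).
import time
from collections import defaultdict

WINDOW_SECONDS = 300  # five minutes, per the benchmark's correlation step

def correlate(events: list[dict]) -> dict[str, list[dict]]:
    """Group events by workload; keep only clusters of 2+ signals whose
    first and last events land within the window. Lone signals stay noise."""
    by_workload = defaultdict(list)
    for event in sorted(events, key=lambda e: e["ts"]):
        by_workload[event["workload"]].append(event)
    return {
        workload: evts
        for workload, evts in by_workload.items()
        if len(evts) >= 2 and evts[-1]["ts"] - evts[0]["ts"] <= WINDOW_SECONDS
    }

now = time.time()
suspicious = [
    {"workload": "api", "ts": now, "signal": "unexpected outbound connection"},
    {"workload": "api", "ts": now + 30, "signal": "shell spawned in container"},
]
print(correlate(suspicious))  # both signals surface together as one incident
```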
[00:16:50] Corey: Yeah, there are a whole bunch of messy stories on these things. How do you handle the analysis of it? How do you find the signal from the noise? It’s, great. Congratulations. Your job as an intern now is to read all the logs as they come in with tail -f. It’s like, your best friend is grep to start, but where do you go beyond that?
[00:17:07] Michael: And then you find out somebody didn’t actually turn [up] the log for that given service.
[00:17:11] Corey: Yeah. And I’m talking about old-school systems that are around for calendar time. One of the big changes we found with ephemeral infrastructure—even ignoring the security piece for a second, as so many people do for way more than a second—is, if you have a problem in an environment that’s an ephemeral container, and you got alerted about it, but that container stopped existing 20 minutes ago, how do you reconstruct what happened? How do you diagnose it? That is where observability fundamentally came from: you need to make sure that you have telemetry and an understanding of these distributed systems in order to track this.
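One answer to that reconstruction problem is simply shipping events off the container as they happen. A minimal sketch—the storage backend and event shapes are stand-ins—of persisting per-container telemetry so the timeline survives the container:

```python
# Sketch: append events to durable storage keyed by container ID, so the
# record outlives the five-minute container that produced it.
import json
import time

class EventStore:
    """Stand-in for whatever durable backend (object storage, a SIEM) you use."""

    def __init__(self, path: str):
        self.path = path

    def append(self, container_id: str, event: dict) -> None:
        record = {"container_id": container_id, "ts": time.time(), **event}
        with open(self.path, "a") as f:
            f.write(json.dumps(record) + "\n")

    def events_for(self, container_id: str) -> list[dict]:
        """Reconstruct a timeline even if the container died 20 minutes ago."""
        with open(self.path) as f:
            records = [json.loads(line) for line in f]
        return [r for r in records if r["container_id"] == container_id]

store = EventStore("/tmp/container-events.jsonl")
store.append("abc123", {"type": "exec", "cmd": "/bin/sh"})
print(store.events_for("abc123"))
```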
[00:17:43] Michael: Yeah, it’s kind of a very different way of looking at it, and it’s kind of hard to wrap your brain around until you actually, like, start tinkering with containerized services. So, by all means, set it up on your own Raspberry Pi, or spin it up at the cloud provider of your choice, but expect [laugh] the trial credits to exhaust very quickly, right, because if you’re going to spin up a Kubernetes cluster—
[00:18:04] Corey: Funny you say that. Last week, I built myself a [Kubernety] . Not multiple, just the one. It’s a single-controller cluster that I have running in the spare room on a bunch of Raspberries Pi. And someone asked, “Well, why didn’t you just run this on EKS?” It’s because I do this for a living, and I’m still worried about the potential surprise-billing aspect if something goes into a crash loop or I wind up not paying close enough attention. Like, “Well, don’t you understand how the bills work better than most?” “I’d say better than nearly everyone, and that’s why I’m scared of it for my own personal side projects.”
[00:18:36] Michael: It’s kind of surprising. Like, even—I mean, you said it, right? You could spin up a very bare-bones cluster, and it… it’s been a bit since I’ve done it, but I would say at that time, logging was disabled, right because as soon as you turn up logging, you’re going to be generating that much more service traffic within the cluster, but then you have data storage needs that might not be allocated to that trial account. So, maybe two or three days, and then you’re going to hit the wall on your trial. But um, yeah, they are very chatty, right, to borrow that network engineering terminology, right?
[00:19:12] Logs fill up very fast, and when you’re talking about abstracted services—containers, running in virtual machines, running on hardware—you’re not going to see that last layer, right; AWS or Azure would kind of see that end of the equation, but you’re seeing the other pieces of the infrastructure. And it’s very challenging to know what specifically is going on, right? And then you don’t want to use the blunt-hammer approach and just shut down an entire VM that might be running a thousand containers, right? And then what’s the production impact? Who knows?
[00:19:40] So yeah, that visibility, or having the appropriate signals to know what the heck’s going on—I guess there’s a little bit of overconfidence, right? Well, we have our trails turned up, we’re gathering everything, we satisfied the audits for that, but is it providing the right container context if you are doing kind of cloud-native development—to circle back on [laugh] where we started the conversation? Which many organizations are, right? Maybe not for the entire app portfolio, but certainly some things that they need to iterate quickly on.
[00:20:15] Corey: There’s other stuff that was in the report that I found interesting as well. You talk about AI adoption growing slowly. And in isolation, that’s a fascinating statement because everyone’s talking about it, it’s massively hyped, I’ve been doing a bunch of work with it myself, and my bill for all of that comes in at less than $30 a month because these are all onesie-twosie experiments, a lot of interactive exploration. And I talk to my clients, and I see that across the board as well. Everyone’s doing something with it, but I have not talked to anyone yet who said, “Well, it’s time for contract renewal, and we’re forecasting $100 million over the contract period, but we’re doing some stuff with generative AI, so let’s make it 150.” No one is saying that. It is not a driver of spend in any meaningful way, yet. It’s a driver of intense interest, but that is not the same thing.
[00:21:02] Michael: Yep. And there’s different consumption patterns, right? [I started] to get at this, like, a lot of buzz was caused by OpenAI with the launch of ChatGPT—
[00:21:11] Corey: Oh, it’s a magic box from the future. It absolutely—there is value there. I don’t want people to think this is another blockchain, where people have been hyping it up for ten years, but there’s no ‘there’ there.
[00:21:21] Michael: Although there’s AI blockchains.
[00:21:23] Corey: Of course, there are AI blockchains. I prefer the original blockchain. I call it Git. And someone’s going to say, “Well, what about Merkle trees?” And I’m going to tell them to go back to computer science class. But that’s a separate problem.
[00:21:32] Michael: Yeah. I’m kind of in the same boat, right? I’ve done a lot of, kind of, tinkering on the side. You know, instead of just a generic Google search, you can enroll in Google Labs, and then you get kind of analyzed search results, which is fun, right? It kind of saves you—it’s fun, but it’s also kind of productive.
[00:21:50] It saves you from running sequences of searches, right? It’s going to stitch together the information for you. So, it’s doing things that, I’d say, as humans, we’re just not used to having machines do for us. Like, it tended to be, give the computer a very simple problem, it gives an answer. Now, I’ve got to piece together everything because the machine is dumb. So, generative AI is kind of pulling those things together. I mean, this is a very simplified explanation, but it’s, kind of, the general use case. And it can be used in a lot of different ways; like in healthcare, maybe it’s going to give you very specific, or close to specific, guidance for your specific situation.
[00:22:25] Corey: Well, like, the counterargument to this—and they all struggle with this—is, honestly, they have the same flaw as a lot of people that I’ve known in the course of my life, which is, if they don’t know the answer, they’re loath to admit it.
[00:22:36] Michael: Yeah. And I’d say maybe a little bit of recency bias here, right, because you’re like, well, what’s the currency of the data? How current is that data? What hasn’t it been trained on?
[00:22:47] Corey: There’s that. There’s, in many cases, it will extrapolate and go from there. One of the big consultancies recently said they’re using generative AI to set their corporate strategy, which is not a statement I would ever have made because—what you’re saying is, basically, this magic parrot that is a spectacularly good bullshit generator in many respects—and the world runs on bullshit in a bunch of respects, but is that something you really want to use? I mean, I use it myself for a variety of creative endeavors. For example, if I’ve written a blog post, I will sometimes slap that whole blog post into generative AI and say, ‘give me ten title options for this.’ And usually, it’s like, okay, number four is pretty good. Let me see if I can punch that out myself.
[00:23:26] It’s a ‘Yes, And’ improv, writers’-room-style approach, and using it in that respect, yeah, it’s someone to talk to—provided you can trust the sanctity of the data—but it can’t make decisions inherently, and if you start trusting it to do that, that becomes dangerous. I’ve seen—I’m sure you’ve seen, too—the Cambrian explosion of inbound cold emails that are clearly written via AI because as soon as you give it more than a glance, like, this makes absolutely zero sense whatsoever, but it is written in a somewhat compelling way. Like, I would never in a million years, at this stage of the game, send out something that generative AI had written to outside people without humans reviewing it first. So, all these chatbots on websites? No, thank you. But that’s not the space that you play in. What are you seeing with AI adoption?
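For what it’s worth, the title-brainstorming workflow Corey describes is only a few lines with the OpenAI Python client. A sketch assuming the v1 client and a placeholder model name—the human still picks and rewrites:

```python
# Sketch of the 'writers'-room' use: ask for options, let a human decide.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def title_options(post_text: str, n: int = 10) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name; swap in whatever you use
        messages=[{
            "role": "user",
            "content": f"Give me {n} title options for this blog post:\n\n{post_text}",
        }],
    )
    return response.choices[0].message.content

# Number four is pretty good? Punch it out yourself from there.
print(title_options("Today I built a single-node Kubernetes cluster..."))
```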
[00:24:17] Michael: Yeah, so it’s—we broke out the statistics, right? And you have to remember we’re kind of seeing how people are engineering their applications and the infrastructure, so what are the specific workloads in their cloud instances—or on-prem, right, if maybe it’s a hybrid deployment, which many customers have—but um… what are the specific packages used to power AI, right? So, I’m not saying OpenAI is a customer—I’m actually not even sure if they are—but what are they using to power their infrastructure? What are the specific packages they use to power the large language model that powers the GenAI, right? And then there are kind of image-recognition libraries as well, right, when you start talking about the visual side of OpenAI’s tools. So, kind of going back to packages, right, we’ve talked about this a lot at Sysdig, but there’s kind of that— [laugh] there are always going to be latent vulnerabilities in software packages; it doesn’t matter what your use case is. You know, right now, we’re talking about AI, but there’s a lot of libraries that will be reused to power AIs.
[00:25:24] So, you’re seeing some awareness now around common attack vectors against AI. I want to say at least one of them has called out kind of the vulnerable libraries that power that. You know, [unintelligible] is very notorious for at least including one of those in their top ten. I can’t remember [unintelligible] AI top ten. But um, if you have latent vulnerabilities in the packages that power your infrastructure and applications, well then, it inherits those vulnerabilities, so it becomes yet another attack vector.
[00:25:53] So, for Sysdig, what we were seeing is kind of the adoption of those NLP engines—natural language processing, large language models, generative AI packages. Like, OpenAI has one that you would use to connect to OpenAI’s platform, so what are we seeing amongst customers where they adopt that? And I believe for Sysdig it was roughly 15%, right, which is, like, well, you know, you’re hearing a lot more about it in media; why aren’t we seeing more? And it’s kind of, you have to consider enterprise use cases and regulatory restrictions, right, and privacy laws. Like, how much data do you actually want to offer an AI engine? Is that going to become OpenAI’s, right? That could be intellectual property or sensitive data.
[00:26:35] Corey: As seen through studies that also indicate that internal data is less valuable to AI than people would suspect. They’ve done comparisons of general-purpose AI versus models highly trained on internal data, and in what I was reading, the general stuff outperformed massively. When they wound up announcing that Copilot could be trained on your internal code stores, my snide comment was, “Oh, great. They’ve taken a very clever robot and found a new way to give it brain damage by exposing it to your corporate code.” And it’s not inherently that wrong. It’s—
[00:27:12] Michael: Well, Microsoft already [unintelligible] , right? They [unintelligible] with the GitHub acquisition [laugh] . So—
[00:27:16] Corey: But the terrifying part is, if you train it on your internal data, and it starts suddenly spitting that out verbatim to your competitors, that’s a problem.
[00:27:22] Michael: Yeah, maybe it’s wrong.
[00:27:23] Corey: Yeah. At some level, it’s like—honestly, it’s like, “So, what happens if a competitor comes in and hires a bunch of our people?” It’s like, well, for a given subset of people, I honestly think we should just suggest it to them. I think that would be the meanest thing we could do to them. Yeah, that’s unkind of me, but there are moments where it’s, everyone has to find their own path, and I think that stealing from competition or appropriating stuff that isn’t yours—which is a nice way of saying stealing—doesn’t get you as far as you’d think. So in the moment, I’m curious to know: anything else that jumped out at you from this version of the report? There’s a lot of good stuff in here. It’s very well done. It’s beautiful. As always, good work. It inspires me as to what some of my writing could look like if I worked [full-bore] close with a graphic designer.
[00:28:06] Michael: Yes. Well, there are many minds that contribute to it. It is a very lengthy effort, and I should give credit where it’s due: Crystal Morin did a lot of work on this report—kind of the primary authorship. But yeah, it’s based on data, anonymizing it, and then figuring out how we surface those signals.
[00:28:24] Corey: Oh, and I want to call out as well—because as people know, they can buy my attention, but not my opinion—one of the things I’ve loved about this report, and I can’t say this about all reports that companies put out, is that it’s not a thinly veiled sales pitch. It talks about the things that your company solves for because that’s where you spend your time, energy, focus, thinking, et cetera, but it’s not a collection of FUD, and the takeaway from the report is not, “Holy crap, I need to go buy Sysdig products now.” Sorry, if that was some marketer’s plan, but that’s not the approach. What I do take away from it—sincerely—is a sense that you folks keep your finger on the pulse of this stuff, you know what’s going on, I am smarter now for having read this, and I recommend other people do, too. It is not going to be a waste of your time. Now, that said, what else is in the report that jumps out at you and got your attention?
[00:29:14] Michael: I was going to bring up a little bit of FUD, but it’s, you know, kind of the reality, and we talk about this in the report, right: it’s kind of the state of permissioning, right, or cloud entitlements. It’s still not trending in the right way. We’re just not seeing organizations really scrutinize what the appropriate permissions are, and it’s not just, you know, your employees or consumers that are in the cloud instance; it’s also those services or machine identities because they, too, need to authenticate and authorize.
[00:29:42] Corey: Yes. That is not FUD. That is the grim reality of it. And I blame the cloud providers. Because I’ve done this myself. I had something that CodeBuild would build and deploy and get stuff set up there, and it’s still permissioned with admin access years later because I tried my darnedest to do the responsible thing, and it kept coming back with, “Nope, can’t do it that way.” And after six iterations, I’m, “The hell with it. To-do: fix this.” And that to-do has been there for seven years. It’s become load-bearing at this point.
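Finding those “to-do: fix this” roles is at least scriptable. A hedged sketch with boto3—it checks only the AWS-managed AdministratorAccess policy and skips inline policies for brevity:

```python
# Sketch: list IAM roles that still carry the AdministratorAccess policy.
import boto3

ADMIN_ARN = "arn:aws:iam::aws:policy/AdministratorAccess"

def roles_with_admin() -> list[str]:
    iam = boto3.client("iam")
    offenders = []
    for page in iam.get_paginator("list_roles").paginate():
        for role in page["Roles"]:
            attached = iam.list_attached_role_policies(RoleName=role["RoleName"])
            if any(p["PolicyArn"] == ADMIN_ARN for p in attached["AttachedPolicies"]):
                offenders.append(role["RoleName"])
    return offenders

for name in roles_with_admin():
    print(f"over-scoped role: {name}")
```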
[00:30:10] Michael: Yeah, and that’s the reality, right? It’s kind of like we preach about least privilege—and you could say, zero trust now—but it’s very hard to do in practice.
[00:30:18] Corey: And the incentives are all wrong. If I over-scope permissions, it might cause me problems in the future, but if I don’t over-scope the permissions, the entire thing does not work, and I have a problem right now in front of me. So, I don’t castigate people for making those decisions. We all know it’s not the right path, it’s a shortcut, but it is so challenging to do it the right way that people don’t. And yelling at people to be smarter does not work when it comes to security. It has not worked for 50 years, so what are we going to do instead?
[00:30:50] Michael: Yeah, and it’s tricky, too, right, because the guidance tends to be, “Well, enable 2FA on your accounts.” It’s like, well, maybe you could do that for your employees. Would you do that for your customers? It depends on how receptive they are to that. Like, maybe they’re not that tech-savvy; they don’t want to [laugh] deal with a mobile authenticator.
[00:31:07] Corey: And a lot of this stuff is one system talking to another. Like, “Okay, great. I’m going to plug a Yubikey into it”—
[00:31:10] Michael: Exactly.
[00:31:11] Corey: —“And then build a robot to push the button on a”—like, that’s just—that’s just—that’s extra steps.
[00:31:16] Michael: Right. And that’s kind of what I was getting to, right? You can’t really do multifactor authentication the same way you would for machine identity. And you’re talking about automation, right? It’s like, well, nobody’s going to sit there, turn a key, this service communication is blessed, allow it— [laugh] and plug in the Yubikey, and then off it goes, right? You’re talking about things like certificate-based authentication, or maybe it’s behavior-based. Like, how is that service talking? Does it have a pattern of doing that, or has it suddenly changed?
[00:31:47] So, it’s much more nuanced and way more complex. You know, we’re starting to see some regulatory language clarify what’s needed, right, and talk more about the service identity pieces of this, but it’s an area that, for a lot of companies—it’s tough, right? You hit the nail on the head. They’re going to get the thing operational, and then maybe it’s on their to-do list, right? We’ve seen [laugh] some—not to name names, but we’re seeing that in some complaints about weak access controls. But this is how it happens, right? It’s very easy to stand up the system and get it running. It’s a lot harder to, kind of, do that in the most secure way possible when you think about all of your identities and how they communicate.
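A toy version of that behavior-based idea—purely illustrative, with made-up service names—baselines which peers an identity normally talks to, then flags departures from the pattern:

```python
# Sketch: flag a service identity talking to a peer it never has before.
from collections import defaultdict

class PeerBaseline:
    def __init__(self) -> None:
        self.known: dict[str, set[str]] = defaultdict(set)

    def learn(self, identity: str, peer: str) -> None:
        """Record a peer seen during the baselining period."""
        self.known[identity].add(peer)

    def is_anomalous(self, identity: str, peer: str) -> bool:
        """True when the communication pattern suddenly changes."""
        return peer not in self.known[identity]

baseline = PeerBaseline()
baseline.learn("billing-service", "postgres")
baseline.learn("billing-service", "payments-api")
print(baseline.is_anomalous("billing-service", "crypto-miner-pool"))  # True
```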
[00:32:26] Corey: There are no easy answers for this, and you wind up with OIDC and being able to pass information back and forth between providers, some of which don’t support it, so it’s, oh, just build long-term credentials and bake them into our system. [grunts] .
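Where the provider does support it, the OIDC route looks roughly like this—a sketch using STS with a hypothetical role ARN, instead of baking long-term keys into the system:

```python
# Sketch: exchange an OIDC token for short-lived AWS credentials via STS.
import boto3

def short_lived_credentials(oidc_token: str) -> dict:
    sts = boto3.client("sts")
    response = sts.assume_role_with_web_identity(
        RoleArn="arn:aws:iam::123456789012:role/ci-deploy",  # hypothetical role
        RoleSessionName="ci-run",
        WebIdentityToken=oidc_token,
        DurationSeconds=900,  # credentials expire on their own; nothing to rotate
    )
    # Contains AccessKeyId, SecretAccessKey, SessionToken, Expiration
    return response["Credentials"]
```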
[00:32:39] Michael: Yeah, exactly. It’s the hard coded password, right? It’s turtles all the way down.
[00:32:44] Corey: One of the reasons it’s easier to stay single-cloud is because at least there, in theory, you wind up with everything being aware of everything else it can talk to, whereas you’ve got to do a lot of extra work to integrate disparate services.
[00:32:54] Michael: Yeah.
[00:32:54] Corey: Though with AWS, it seems like some of those services don’t talk to each other very well, either. But that’s a separate problem.
[00:32:59] Michael: Ri—exactly. Yeah. Does the service within the cloud even work with the secrets management solution? AWS does have one, but are all their services fully integrated? And then if you’re custom-building something, does it also communicate with AWS Secrets Manager properly? It’s a question mark, for sure.
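The custom-built case Michael raises is usually a startup-time fetch like this—a minimal sketch; the secret name is a placeholder:

```python
# Sketch: pull a secret from AWS Secrets Manager instead of hard-coding it.
import boto3

def get_db_password(secret_id: str = "prod/app/db-password") -> str:
    client = boto3.client("secretsmanager")
    return client.get_secret_value(SecretId=secret_id)["SecretString"]
```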
[00:33:16] Corey: Yeah. I wish that there was a… I wish there was a better future. I want to thank you for taking the time to speak with me. And if people want to grab their own copy of the report, they can grab it at www.sysdig.com/sitc.
[00:33:30] Thank you so much for taking the time. Last question before we wind up wrapping it. What do you anticipate changing when we have this conversation a whole year away—which will go by in an eye-blink—in 2025?
[00:33:43] Michael: Yeah, very good question, Corey. You know, I would say the AI trends will increase, right? I’d probably see some more adoption of packages as organizations experiment internally. Like, how can this support an enterprise use case and do that securely? I don’t know if the identity piece will be fully addressed, but maybe we’ll be more aware of it. Maybe we’ll see those numbers trend a little better, but there’s going to be no shortage of, kind of, regulatory oversight of that, right?
[00:34:09] There’s a lot of pieces that are going to start kicking in, certainly this year, with the NIS2 Directive. But as things like the national cybersecurity strategy in the US get fleshed out, and all the supporting acts, these things are going to become incredibly important. So, keep your eye on those things; make sure you’re securing, right? It’s packages at the end of the day, it’s containers, it’s the things we know, you know, as engineers and security leaders. So, the fact that it’s an AI magic box doesn’t mean that the technology fundamentals changed or the security principles changed. It just means we need to make sure we’re doing it in more places.
[00:34:41] Corey: I’m going to come back in a year, and we’ll find out. I’m looking forward to seeing how that bears out. I mean, go back two years ago, AI was never on the roadmap. It was like, if you talked about that, it would be, “Oh, yeah. Ten years.” It’s always ten years away, kind of like cold fusion is always 20. Same story. Now suddenly, tomorrow becomes today, and here we are. Thanks for your time.
[00:34:59] My guest has been Mike Isbitski, director of cybersecurity strategy at Sysdig. I’m Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you’ve enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you’ve hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that I will later delete because I’ll have access to your email inbox.