Creating GenAI Teammates with Amit Eyal Govrin

Amit Eyal Govrin: So essentially, go and assign that entire end-to-end role, which can also perform with high velocity, high accuracy, and high predictability, and at the end of the day be fully audited for compliance reasons. And of course, that frees up the humans, and that frees up your team's time to go and innovate.

Corey Quinn: Welcome to Screaming in the Cloud. I'm Corey Quinn. Unless you've been hiding under a rock somewhere, you've probably heard a fair bit about GenAI lately and how it is the savior slash doom slash hype cycle to beat them all. My guest today is Amit Eyal Govrin, who is the CEO at Kubiya. First, thank you for joining me.

I appreciate your taking the time.

Amit Eyal Govrin: Thank you for having me here, Corey.

Sponsor: Complicated environments lead to limited insight. This means many businesses are flying blind instead of using their observability data to make decisions, and engineering teams struggle to remediate quickly as they sift through piles of unnecessary data. That’s why Chronosphere is on a mission to help you take back control with end-to-end visibility and centralized governance to choose and harness the most useful data. See why Chronosphere was named a leader in the 2024 Gartner Magic Quadrant for Observability Platforms at chronosphere.io.

Corey Quinn: So you have obviously been a little bit on one side of the GenAI "is it great, is it terrible" divide, given that you have a company that is selling something directly in the space. Take that head on: what are you building?

Amit Eyal Govrin: What I'm not building is an AI solution.

I'm building an outcome. And the outcome is actually, maybe if we take one step back, we can talk about what I'm trying to solve for, and kind of the paradox I'm trying to shatter. And talk about where GenAI can enable that.

Corey Quinn: That's a good way of approaching it. So often it feels like people are raising giant rounds just because, "All right, I basically wrote a Python script where step one is import OpenAI, and step two is, meh, we'll figure it out."

Then they're shocked, simply shocked, when a feature enhancement OpenAI puts out destroys their company. Who would have predicted it could speak PDF one day? And yet, here we are. So what is the problem you're aiming at and what outcome are you going for?

Amit Eyal Govrin: That's a fair statement. By the way, I'll just acknowledge that.

Let's be honest. Most people discovered GenAI from under a rock about two years ago, when OpenAI officially announced themselves to the world with ChatGPT. Clearly, we've been doing this a little bit longer. We're working with all the up-to-date models, training our own models, doing all the things that you would expect an AI company to do, but that's not the topic of today's discussion.

Clearly, that's for people to geek out with afterwards if they want docs. What we're actually looking to solve, because that's really why people are listening: there's a concept called the time to automation paradigm. Are you familiar with it, Corey?

Corey Quinn: I think it's better if you explain it to everyone, because even if I am, I guarantee you, someone is not.

Amit Eyal Govrin: Actually, I want you to do the selling for me, so

Corey Quinn: Okay, on some level, it's a question of, is the juice worth the squeeze, longer term? The idea of how long do you spend automating a thing versus how many times do you do the thing? If you're spending three days to automate something you do once a quarter that takes five minutes to do, cool. Is that worth it?

Well, the answer is, of course, it depends. But that's a loose definition of it. You probably have a better one.
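
The trade-off Corey describes can be put into rough numbers. The sketch below is only an illustration: the function name is made up for this example, and it assumes that a "day" of automation work means eight working hours, which the conversation does not specify.

```python
def break_even_years(automation_minutes: float,
                     manual_minutes_per_run: float,
                     runs_per_year: float) -> float:
    """Years until the time saved by automating equals the time spent automating."""
    saved_per_year = manual_minutes_per_run * runs_per_year
    if saved_per_year <= 0:
        raise ValueError("the task must take time and run at least occasionally")
    return automation_minutes / saved_per_year


# Assumption: three "days" of automation work at 8 hours each.
THREE_DAYS_IN_MINUTES = 3 * 8 * 60

# The example from the conversation: a five-minute task done once a quarter.
quarterly = break_even_years(THREE_DAYS_IN_MINUTES,
                             manual_minutes_per_run=5,
                             runs_per_year=4)

# For contrast: the same five-minute task done every day of the year.
daily = break_even_years(THREE_DAYS_IN_MINUTES,
                         manual_minutes_per_run=5,
                         runs_per_year=365)

print(f"Quarterly task: {quarterly:.0f} years to break even")
print(f"Daily task: {daily:.2f} years to break even")
```

Under those assumptions the quarterly task takes decades to pay back on time alone, while the daily one pays back within a year. The calculation deliberately ignores maintenance effort and error rates, both of which shift the answer.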

Amit Eyal Govrin: Actually, that's a perfect layperson definition of it, but it's really about the outcome and the effort. The effort is the time to automation: the amount of time it takes to write the script, the Terraform landing zone, the configuration file, and obviously to maintain that golden path use case. The question is whether that effort is congruent with the number of times it gets used, or essentially the output you receive.

And oftentimes you'll find that it's not. Oftentimes the level of effort and determination, and obviously the ongoing maintenance it takes to automate an end-to-end process, doesn't necessarily relate to business outcomes. And that's typically where you see a lot of organizations with a very clearly defined automation strategy coming in with, "We're going to set up an internal developer platform," name it Backstage or any other variation of that. "We're going to go and set up some kind of self-service platform."

They're going to go spend all this effort, oftentimes even the headcount associated with it, just to find out that after a year's worth of toil, they managed to set up seven golden path use cases. And the first time a developer tries to self-serve through one of these automations, they encounter some kind of configuration or access or permission issue and go right back into the Jira ticket queue, back to the on-call engineer, and say, "Guess what, buddy? Enough is enough. Enough of this nonsense. I need a human to help."

And that repeats itself in various formats and flavors across organizations, but that's the sum of it. The paradox, just like Jevons paradox in many respects: if it takes longer to automate something than the output is worth, it's not going to get done.

And what you find is many organizations start with the strategy, end up going down the line just to find out everyone's doing ad hoc scripting. Until what happens

Corey Quinn: You do tend to cut yourself to ribbons on the edge cases though. Okay, this only takes five minutes every morning to do, why spend a week automating it?

Well, it's done every morning, sure, but look at the defect rate when humans are doing this. Sometimes people are hungover, sometimes someone is out sick. If you can get a better outcome via automation, that does tend to put a thumb on the scale from time to time. But yeah, directionally, you're spot on.

Amit Eyal Govrin: No argument there. And the outcome-based approach we're referring to all comes down to the amount of effort versus the output. Imagine it was as easy to automate an end-to-end process as it is to have a conversation with, say, Bob. Bob's your on-call engineer, and every time you need something from a platform, you just go to Bob and say, "Hey, Bob, can you go ahead and configure this resource for me?

Can you go ahead and grant me access or permission to this IAM policy, or create a policy with elevated permissions, or get approval for this resource that requires approval?" If you could do this as easily as having a conversation with Bob, guess what, Corey? It's going to get done every single time.

So what we've created with Kubiya, and this is why you mentioned GenAI at first and kind of set this up for this answer, is this: instead of Bob, you have an AI teammate called Kubiya, or Kubi Jr. In this case, you can have a full-on, bidirectional conversation, and it is access aware and permission aware, and able to meet the users in the exact channels where they already communicate and collaborate: Slack, Teams, Jira, Kanban boards.

Then you could have that personalized experience and release Bob to actually do the real work, which is setting up the infrastructure to tee up the company for the next GenAI product that they're going to go and roll out themselves.

Really, it's all about outcome and rewards. If you're going to want to go and put all this effort, it better be worth the reward, unless it's as easy as having a conversation. Then you break away from that.

Corey Quinn: The counterpoint, of course, is this has been teased at, and people have done a number of experiments, myself included, of, all right, I want a Python script to do X, go ahead and build that out for me.

And ChatGPT gets some parts right, some parts wrong, but it is very far away from being something that I could accept out of the gate as meeting the acceptance criteria that I have for it. It feels like, on some level, I'd dive right back into your paradox: I would spend more time supervising the thing than just writing the quick script myself in some of those cases.

How do you get around that?

Amit Eyal Govrin: So you hit the most important point. It's not just about doing the automation, because that's half the battle. It's about doing it in a way that's expected, controllable, and fully auditable. And we're actually allowing that. We're using Terraform as our backend, in many respects, to give the user the ability to control every aspect of this interaction with the teammate, which is configured with Terraform.

So you get to control the environment variables, you get to control the output, you get to control the permissions, and you get to control every single aspect of it. So as an operator, you're in full control. As an end user, you're interacting and having that LLM type of experience that people are accustomed to with ChatGPT and other types of chatbots that they're comfortable with.

So you're getting the best of both worlds.

Corey Quinn: I want to dig in a bit on the idea of talking about this as a virtual teammate, specifically from the perspective of, I don't know about you, but I have something of a potty mouth when I'm berating Siri when he gets something wrong, or Alexa when, basically, I ask for anything and then I'm followed up with a, "By the way, buy some more pants!" or whatever it is they're trying to sell this week.

If I talked to an actual colleague like that, HR would be inviting me to a meeting in which I'm not offered coffee, and very shortly afterwards I'm not allowed back in the office ever again. So there's a question of how much this is an accelerational tool for folks who are getting value from it, versus how much it is actually intended to be a full-on member, the fourth person on a three-person dev team.

Is this an employee replacer or an augment? Where on that spectrum do you see it landing?

Amit Eyal Govrin: It's just like what you would say across all sorts of industrial revolutions: the humans weren't replaced by the smart assembly line. They became supervisors of the smart assembly line and managed to reinvent their position.

And you could go down the list, from how you did things on prem to the cloud, and through all the different revolutions. At the end of the day, AI is a tool. It's an enabler. It's a megaphone. If your entire role in an organization is to move a pencil from right to left, then likely AI will replace you.

I'm sorry to say that. But if you actually have the capability of becoming a supervisor of agents, think about it this way: up until now, you've been an individual contributor. Now you actually get to supervise your teammates. So you're a DevOps manager all of a sudden.

That becomes a completely different job title. And of course you get to see everything through and have the highest and best use of your time freed up for the things that AI isn't prepared to do.

Corey Quinn: In some ways you're talking about moving commoditization a bit further up the stack. Similarly, it used to be that you had to run a bunch of compiler commands to get a web server to then run your application; then it was just yum or apt install; then in time it became, oh, now it's docker run and you get the whole thing prepackaged, ready to go.

And you're spending more of your time trying to get the application to do what you want it to do, and not getting the application set up in the first place.

Amit Eyal Govrin: That's exactly what we're saying. You don't want to have to do the repetitive work that otherwise would have been better suited for AI. Free up your plate.

You have plenty of work to do. You probably have a big backlog that you haven't even gotten around to because you're behind the eight ball every single day when you start. There are hair-on-fire drills.

Corey Quinn: Do more with less is the persistent rallying cry of our current industry and honestly our entire system.

Amit Eyal Govrin: There are case studies being taught at Harvard and every single business school. It's all about Blockbuster and Netflix, right? Don't be Blockbuster. Don't be left behind. If you know how to reinvent yourself and adjust yourself, AI is going to be the biggest enabler, the biggest boost you could ever have in your career.

Otherwise, if you feel that you're perfectly fine with stacking DVDs and dropping them in the inbox every single Sunday when people have to return them, you're going to be left behind, and the streaming movement will take you by storm. That's kind of what we're saying. Don't be left behind by AI.

Have AI be the enabler for you in your career.

Corey Quinn: I don't necessarily disagree with the premise. I think that it is fairly clear at this point to most folks that there is value that can be derived from GenAI. Whether it is this wild transformation of society into a perfect utopia, I'm a little bit of a skeptic. But it's similar to, "Oh, I insist on doing long division the old way because I'm not a fan of these newfangled things called calculators."

Yeah. It acts as a tool that accelerates, but understanding when to apply it, how to validate the output that comes out of it, and how to ensure that it's not insane is going to be something I think that we're stumbling through as a society. And in many cases, the hallucination problems aren't making a strong case for, "Let's turn the air traffic control system over to the GenAI and hope for the best."

I think that it's a matter of nuance. Before this, developers would wind up using Stack Overflow, the world's premier copy-and-paste website, and use that to solve problems on an iterative basis. You can amalgamate that into various coding assistants and chatbots. Having them actually go ahead and do the implementation seems like the next logical step, but as always, there's going to be some question around the margins.

Is this going to be something that we can actually trust? And if so, how far?

Amit Eyal Govrin: So the beauty of what we're trying to accomplish here is that it's not to let AI take over the entire end-to-end workflow and orchestrate the entire process. You can't avoid hallucinations. By the way, that's a feature within large language models, right?

It's a statistics-based approach. Every single answer will deviate from the other answer, every single time. What we're actually advocating for is to make it very controllable, very predictable, and that's where the Terraform code comes into play. The AI enablement is essentially the natural language interaction that you have, where you can go and abstract away the business logic of your intent and then work that into a predefined, pre-gated workflow.

So it's essentially combining the best of both worlds: the known and expected structure, along with all the things we know and love about large language models.

Corey Quinn: I think that that's a fascinating approach. I mean, something that I've always done when I've been asking large language models is I won't ask for the answer.

Because, okay, you're going to give me an answer. Maybe it's right, maybe it's wrong, but regardless, you certainly sound very confident in what it is that you're saying. What I'll ask instead is for a script to go ahead and do the thing to get the answer out of it. Because that, from my perspective, that gives me two great paths.

One, I can see how it's doing that and potentially catch weird issues it's making along the way. And two, okay, that was great. I want to iterate on that now. I'm not going back to square one or trying to find the chat that generated that and then have it go ahead and pick up where I left off.

I find that the show-your-work stage, and breaking it down into stages, means that when it starts to go off the rails around step 17, you can go back to 16 and try again to get things moving along again. I think that that aligns with the approach that you're taking.

Amit Eyal Govrin: It's a very modular approach, with the ability to insert your own tools, your own scripts, your own code as part of this, and to inject that into the process. Make sure that you control every aspect of your workflow. It's essentially your own words orchestrating an end-to-end complex process that otherwise would have taken disparate tools and processes within the organization to accomplish in the same way.

It's just done in a highly, highly condensed time to automation, which is the beautiful part about it. We can go into use cases if it helps.

Corey Quinn: By all means, give me an example use case. Let's talk about something real, rather than the ephemeral vision of the developer of tomorrow. Let's talk about something that someone might actually do.

Amit Eyal Govrin: One of our favorite golden use cases, if you will: one of our customers came to us to enable a self-service infrastructure or resource provisioning platform, all within Slack, which is obviously where they meet their users. So the concept is a user comes in and asks for, I don't know, a new SQS queue, for example.

And this is part of an application that they want to copy over from one of their other resources. So the teammate is able to first verify the identity of the user, verify that they have permission, maybe even create a just-in-time policy in order to enable that user to do so, but then also trace back what the cost of this resource would be, because there's budget enforcement that has to come into play.

So if it costs more than, say, a hundred dollars a day, that requires some additional layers of approval, which again, the teammate could also go and get. So there's the ability to enforce budget, enforce policy, and create least-privilege automation without needing to assign a role.

That's already a big win for this organization. At the end of the day, they also care about cost. So not only are you enforcing the budget, they also have a cleanup process where you manage a TTL. You can configure that: 3 hours, 3 days, 30 days. After that, it would automatically destroy that resource and bring things back to where it was before.

So you're never, you know, over provisioning or over resourcing, and it's all done as a simple conversation. This same process, if you would have copied that over to the way they currently do it, would take a matter of three to five days and five different people involved in the process to provision and de provision that resource.

With us, it's less than a minute.
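
The gates in that example (permission check, budget threshold, TTL-based cleanup) can be sketched in a few lines. This is a simplified illustration and not Kubiya's implementation: every name below (`ProvisionRequest`, `Decision`, `evaluate`, the `permissions` mapping, the user "dana") is hypothetical, the cost estimate is assumed to arrive from some external estimator, and the just-in-time policy creation mentioned above is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Set

APPROVAL_THRESHOLD_PER_DAY = 100.0  # dollars per day, the figure from the example


@dataclass
class ProvisionRequest:
    user: str
    resource_type: str
    est_cost_per_day: float  # hypothetical: assumed to come from a cost estimator
    ttl: timedelta           # how long the resource is allowed to live


@dataclass
class Decision:
    status: str  # "denied", "needs_approval", or "approved"
    reason: str
    destroy_at: Optional[datetime] = None


def evaluate(req: ProvisionRequest,
             permissions: Dict[str, Set[str]],
             now: datetime) -> Decision:
    # Gate 1: is this user allowed to request this type of resource?
    if req.resource_type not in permissions.get(req.user, set()):
        return Decision("denied", f"{req.user} may not provision {req.resource_type}")
    # Gate 2: budget enforcement; expensive requests need extra approval.
    if req.est_cost_per_day > APPROVAL_THRESHOLD_PER_DAY:
        return Decision("needs_approval", "estimated cost exceeds the daily threshold")
    # Gate 3: approved, with a time after which the resource should be destroyed.
    return Decision("approved", "within budget", destroy_at=now + req.ttl)


permissions = {"dana": {"sqs-queue"}}
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
request = ProvisionRequest("dana", "sqs-queue",
                           est_cost_per_day=12.0, ttl=timedelta(days=3))
print(evaluate(request, permissions, now))
```

Keeping the decision logic as a plain, deterministic function is one way to get the predictability discussed earlier: the conversational layer only has to fill in the request fields, and the outcome is then fixed by rules that can be audited.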

Corey Quinn: I think there's also significant value in being able to spit these things out so that they then go through a somewhat normal production process where, okay, great, this works in a test account, it can go ahead and spin things up, whatever. Ideally there are guardrails somewhere around it to prevent it from doing the psychotic things that make the headlines. But okay, once that's done, great: then having it be vetted as it gets promoted to higher environments, and having humans weigh in on that, does seem like a reasonable control.

Objectively, it's not that big of a deal when you have a GenAI system hallucinating or being wildly inappropriate, unless you're deploying that directly to customers without any form of human review along the way. I think if you put a chatbot on your website, authorize it to cut deals on your behalf, and then it does horrific things, you're unhinged.

I think if there's a human review that it goes through to validate that it's on brand, that it is doing what you want it to do, well, that seems like a much more reasoned, rational approach. Maybe I'm just old and I have perspectives on these things that don't necessarily align with the rest of the industry.

But here we are.

Amit Eyal Govrin: And I fully agree, Corey. So from my perspective, and this is why I want to make sure we're on the same page: it's not by accident that we called it a teammate. It's not a copilot. A copilot just watches over your shoulder and does code completion. By design, it's limited to the human-in-the-loop interaction you're involved in.

Here, we're talking about the concept of delegation. We're saying delegation is the new automation. Suppose you could just instruct one of your teammates to go ahead and solve the entire Jira ticket queue, and to come back to you and report the mean time to resolve every single ticket, and trust that you no longer need a human to do it, only a human to supervise the outcome and make sure that it's fully audited and compliant.

Then you've just saved potentially dozens of hours from the human's work week and, at the same time, freed that human up to do quite a few more important things that they have on their plate. So, essentially, go and assign that entire end-to-end role, which can also perform with high velocity, high accuracy, and high predictability, and at the end of the day be fully audited for compliance reasons.

And of course, that frees up the humans, and that frees up your team's time to go and innovate.

Corey Quinn: I think that that's probably a very fair way of splitting the difference. Now, the obvious question, which I did allude to at the beginning and which so many companies have run into: is this effectively a three-line Python script that starts with import OpenAI, only to be shocked, simply shocked, when that company doesn't hold still and releases something new? What is the moat, for lack of a better term?

Amit Eyal Govrin: Well, the complexity of the infrastructure goes beyond this discussion, or my acumen as a CEO, to be fair. We're using over a dozen different language models.

Some of them we're fine-tuning ourselves, some of them we're training ourselves, and some of them are GPT 4.0, Anthropic, and so forth. But at the end of the day, everything is broken down into multi-agent systems, so every single operation or task may be invoking a different language model. Just to give you an idea: if you encounter an operation that requires interrogating a resource and doing Q&A, that would invoke a different language model, and probably a different agent, than if you're asking it a question, or if you're asking it to provision something.

So you would potentially have three different paths you could go down, depending on the context you give it in the question. And there, for example, you're encountering the classifier agent, which knows how to pick the right agent for you to be routed towards, so it can go down that path.

So, as an example, that's just one element of this. We can go into different workflows and how you go and seek approval. We have multiple agents, so the approver agent isn't necessarily the policy agent, and isn't necessarily the TTL agent. So you have a kind of Chinese firewall between these agents, so you can't brute-force your instructions and try to get information out of it that otherwise would have been under some form of access control.
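
The routing idea described here, a classifier that sends each request down one of three paths, can be sketched as follows. This is a toy under loud assumptions: keyword rules stand in for whatever model the real classifier agent uses, and all function names, path names, and keywords are invented for illustration.

```python
from typing import Callable, Dict


def classify(message: str) -> str:
    """Toy stand-in for a classifier agent: keyword rules instead of a model."""
    text = message.lower()
    if any(word in text for word in ("provision", "create", "spin up")):
        return "provision"
    if any(word in text for word in ("status of", "describe", "inspect")):
        return "resource_qa"
    return "general_question"


# Each path is handled by its own agent; in a real system each could be backed
# by a different language model and a different set of permissions.
def provision_agent(message: str) -> str:
    return f"[provision agent] queued: {message}"


def resource_qa_agent(message: str) -> str:
    return f"[resource Q&A agent] looking up: {message}"


def general_agent(message: str) -> str:
    return f"[general agent] answering: {message}"


AGENTS: Dict[str, Callable[[str], str]] = {
    "provision": provision_agent,
    "resource_qa": resource_qa_agent,
    "general_question": general_agent,
}


def route(message: str) -> str:
    # The caller never picks an agent directly; the classifier decides.
    return AGENTS[classify(message)](message)


print(route("Please create a new SQS queue"))
print(route("Describe the orders queue"))
print(route("What is least privilege?"))
```

Because callers only ever reach an agent through `route`, each agent can be given narrow responsibilities, which is in the spirit of the separation between agents described above.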

Corey Quinn: There's a lot of, I think, excitement around GenAI. And I get it. The first time I saw Chat Gippity do something, it was magic. It was, oh, wow, I'm watching the future unfold. And it's rare you get those moments where you get to see it. It reminds me of the first time I walked into an Apple store and played with an iPhone.

And it was, oh my God, this is so much better than the crappy BlackBerry I was using. There are those transformative moments in time. Now, whether it's worth the massive uprooting of everything and hurling down the well after GenAI, I don't know. But from what you're doing and from what I've seen of it, I think that you're definitely building something interesting.

What that turns into and how that winds up manifesting, I think, definitely will remain to be seen, but that's the nature of anything.

Amit Eyal Govrin: So I side with everybody here: I hate hearing about the next GenAI company raising their 200 million round based on a pipe dream. But there's a huge difference, and there's levels to this, Corey.

There's a huge difference between people putting together a demo, people putting together a POC from the demo, and then people going into production in an enterprise-grade environment. And this is effectively where we've already arrived. We can talk about that, though I'm not sure if this is before or after our embargo, but we have enterprises that are effectively in production, working with us and enjoying the fruits of their teammates.

Corey Quinn: Yeah, I think the proof is always going to be in the pudding. Great, you can tell beautiful stories. I mean, I love the sound of my own voice. That's why I have two podcasts. But you can only go so far before having actual customers pony up and say, "Yes, this is valuable. This is something that we are investing in. And whether you think it's hokey or not, we're going to be spending a boatload of money on it."

I mean, Kubernetes is a great example of this. I thought that was significantly overhyped in some circles, but everyone's using it at this point. I was clearly wrong. I'm wrong a lot. That's the best part about being me.

I know Amazon likes to say leaders are right a lot, but no, no, no. I like being aggressively wrong, but then adjusting my opinion in light of new information. What do you think folks are misunderstanding the most right now about GenAI's opportunity? What are they sleeping on that they perhaps shouldn't be?

Without descending into full on boosterism.

Amit Eyal Govrin: Hey, go big or go home, right, Corey? That's the motto. GenAI, it's a very powerful technology. But it's not an end-all, be-all. It's not the panacea, right? You need to have a clearly defined pain that you're going to solve. You need to have a clearly defined path and a clearly defined architecture to get to that path.

Until that all aligns and all the stars align, everything else is vaporware. It really is. And this is where, you know, we kind of talk about separate the men from the boys. There's very few companies in production working with enterprises in, in GenAI applications. We're one of those. We're not the only ones.

I assure you there's others that are coming up. But at the end of the day, the proof is in the pudding. We give a guarantee on our product. We even let them opt out after three months if they don't necessarily enjoy that experience because we are outcome based. If you're going to go and you're going to enjoy the fruits of our labor, you're going to pay.

If not, take your money and leave. Part ways, and come back to us in a year when you think you're ready. When you think you have a better way of doing it.

Corey Quinn: Yeah, I have little interest personally in taking money from people that aren't seeing value in return for that. I'd rather lead to a good outcome, because it turns out it's a hell of a lot easier to sell to an existing customer than it is to a new one.

But if you wind up basically leaving them feeling fleeced, you don't really have much of an option to sell a part two.

Amit Eyal Govrin: Word of mouth doesn't travel very well when that happens, right?

Corey Quinn: No, and what is it, like, bad news travels 10 times faster than good news? Yeah, I've seen that all the time. Whenever I'm cynical on Twitter, I wind up getting an awful lot of traction out of it.

But if I say, this is surprisingly great, no one cares. No one wants to hear positivity. They want to hear the overwhelming negativity aspect, and we're now doing our best as a society to algorithmically boost it. But here we are.

Amit Eyal Govrin: That's why we've been very cautious not to overhype what we've been doing until we have proof points and social proof for this.

I'm not going to pick on Devin. They did an excellent job trying to be pioneers in this space. But let's face it, they probably should have been a little bit more cautious before they released their videos and the boosting about what they're doing, because at the end of the day, they fell where a lot of companies are falling: having a controllable, autonomous software engineer requires having actual control measures in place.

I don't think they've done that. Maybe they will with the 200 million they just raised. Best of luck to them. We don't have the privilege of raising 200 million, but we have the privilege of knowing exactly what we're doing and how to tame the large language models to behave the way we want.

Corey Quinn: I really want to thank you for taking the time to speak with me about this. If people want to learn more, where should they go?

Amit Eyal Govrin: Well, I guess you can look at my shirt, but I don't know if I'm high up here. Kubiya: K-U-B-I-Y-A dot AI. Happy to have anybody ask questions. We have a chatbot on our website, but you could also sign up...

Corey Quinn: Because of course you do.

Amit Eyal Govrin: ...for the waitlist, and we're happy to answer your questions.

We have a support channel as well. So very happy to take questions, and I appreciate your time.

Corey Quinn: Of course. And we will put links to all of that in the show notes. Amit Eyal Govrin, CEO at Kubiya. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you enjoyed this podcast, please leave a five-star review on your podcast platform of choice.

Whereas if you hated this podcast, please leave a five star review on your podcast platform of choice, along with an angry, insulting comment that I will just assume was written by a malfunctioning chat bot.

2021 Duckbill Group, LLC