Practical Applications and Future Predictions for AI Security in 2024

View Show Notes and Transcript

What is the current state and future potential of AI Security? This special episode was recorded LIVE at BSides SF (that's why it's a little noisy), as we were amongst all the exciting action. Clint Gibler, Caleb Sima and Ashish Rajan sat down to talk about practical uses of AI today, how AI will transform security operations, whether AI can be trusted to manage permissions, and the importance of understanding AI's limitations and strengths.

Questions asked:
00:00 Introduction
02:24 A bit about Clint Gibler
03:10 What's top of mind with AI Security?
04:13 tl;dr of Clint's BSides SF Talk
08:33 AI Summarisation of Technical Content
09:47 Clint’s favourite part of the talk - Fuzzing
15:30 Questions Clint got about his talk
17:11 Human oversight and AI
25:04 Perfection getting in the way of good
30:15 AI on the engineering side
36:31 Predictions for AI Security

Ashish Rajan: [00:00:00] Hey, it's editor Ashish. Just wanna give you a heads up on this particular episode. This was recorded at BSides SF and it was loud. We tried our best to make the audio great for you because the episode by itself is quite valuable. It's a talk summary from Clint Gibler as well as Caleb Sima's keynote at BSides SF.

I just wanna apologize upfront that, hey, we tried our best to make the audio as seamless as possible. You can hear everything; it should be fine. Obviously it's a conference, so there's a bit of background noise. I hope you still enjoy the episode. There's video and audio as always, and I think it's still valuable if you are trying to learn about AI security.

Just want to give you a heads up. All right, let's start the video.

Caleb Sima: You got the vomit?

Ashish Rajan: Yeah, you got the vomit. You know what you had? You had vomit.

Caleb Sima: Vomit.

Clint Gibler: Very familiar.

Caleb Sima: Should AI automatically have the ability to accept and allow these permissions instead of people? And most people instinctively, as security people, hell no. You can't just allow an AI to approve permissions.

That's a recipe for nightmares. But yet, I think today's state of [00:01:00] this is actually worse than what an AI would do if it auto-approved these things. It would probably be better than the way that we're delegating and doing access permissions today.

Clint Gibler: I think that people will start thinking of AI more as a tool to solve problems rather than just throwing AI at everything.

Ashish Rajan: AI for security and security for AI. What is it tomorrow and what is it today? Today I had a conversation with Caleb and Clint Gibler here at BSides SF, where Clint did an awesome job of going through almost 2,000-plus resources to identify what AI looks like today and how cybersecurity can use it.

Caleb, on the other hand, did the keynote for BSides SF and spoke about what AI is today in terms of security and what it can be tomorrow. We did the all-flavor jelly beans as part of making it fun for both the cohost and the guest, which is Clint. Overall it was a great conversation. We spoke about where we see AI today in terms of security,

what engineering can be doing today that we could bring back, and a hint for people who probably don't know what waterfall is: waterfall may be making a comeback with AI. So all that and a lot more in this episode of the AI Cybersecurity Podcast. If you're here for a second or third [00:02:00] time, we would really appreciate it if you share this with someone who's trying to figure out what AI security could look like in, say, the next two years. If you're watching this on YouTube or LinkedIn, definitely give us a follow and subscribe; it means a lot. And if you are listening to this on Apple or iTunes, definitely give us a five-star rating and let more people know about the podcast as well.

Welcome to the AI Cybersecurity Podcast at BSides SF. For today's edition, we're having all-flavor jelly beans as our theme.

Clint Gibler: Hey, my name is Clint Gibler. I'm the head of security research at Semgrep. We're an AppSec company building static analysis tools, basically ways to find vulnerabilities in either first- or third-party code, or find secrets, things like that.

The reason I'm interested in AI is I'm always fascinated by the latest and greatest in security. I think reading what people are doing and how they're innovating is just personally one of my favorite things. That was what I enjoyed about grad school, and now, running the newsletter tl;dr sec (tldrsec.com), basically every week I devote way too much time to reading about cool stuff happening. And, yeah, so when I saw more and more things happening in the [00:03:00] LLM space or AI space, I was like, yeah, this is very interesting. I'm also friends with Daniel Miessler, and obviously he is 110 percent on board with that, so it's hard not to have a little bit of that grimoire on you.

Ashish Rajan: What comes to your mind when you talk about AI security, and are you focusing on a particular layer of that as well?

Clint Gibler: AI security is too ill-defined, because that could mean at least seven things. The thing that my talk was on, that I sort of focus on, is applying AI to cybersecurity problems. However, there are many other things. For example, how do we secure our usage of AI or LLMs? Whether that's at the application level, or there are various supply chain attacks in terms of poisoning the model

or any of its dependencies, or Hugging Face. A number of the models are just pickle files, and pickle can run arbitrary code. So there were a couple of either POCs or even maybe real examples of people uploading malicious models, which people would then run. Obviously models could potentially leak PII.

So when you say AI or LLM security, there are many things that are sometimes [00:04:00] overlapping, but sometimes not. But yeah, for me as a cybersecurity practitioner and something I'm interested in, I think securing AI is super important and there are people doing great work on it, but given I only had four to five minutes to give four hours of content, I should be very specific.

Caleb Sima: Yeah. So there's securing AI, and then there's AI for security. And clearly, AI for security is the bucket that your talk is focused on right now. So give us a little bit of a rundown, what's the short version of the talk? But before you do that, you have to answer. Oh, I gotta eat these things.

By the way, is this our first, this is our first in-person recording?

Clint Gibler: Yeah, it is our first in-person recording, that's why. Okay, you did this one? That one's pretty good too. Yeah. I'm nailing it so far, Caleb. Yeah. So far, so good.

Ashish Rajan: I think, I feel you're cheating. Dude, I'm reaching in the box, and I'm just grabbing.

Alright, next time, I've got different flavors. Alright. Alright. Okay.

Clint Gibler: Okay, I'm going Oh no, which one was it?

Ashish Rajan: That one's really good.

Clint Gibler: Mine tastes interesting.

Ashish Rajan: I don't want to tell you what it is.

Clint Gibler: It's not like a sweet one.

Ashish Rajan: You know what you had? You had vomit. Was it pink?

Clint Gibler: Vomit. Very familiar.

Ashish Rajan: You got the vomit?

Yeah, [00:05:00] you got the vomit. So

Clint Gibler: yeah, the point of the talk was, every week I'm reading tons of things that people are doing in really, for me, for my own use. I was like, okay, I need to try to synthesize all this and put it all in one together so I can look at themes and patterns across many areas.

And I think both of you are probably very familiar where there'll be some idea or some technique or some approach that's like prevalent in one area of security or even in engineering, but it's not in security or like a different area of security. So I was wanting to provide this like meta insight into oh, cool, this is like a type of thing.

Where else can we apply this? So I tried to cover as many areas as possible: application security code review, fuzzing, going from English to some sort of domain-specific language, like a security query language, whether that's a Semgrep rule or a Nuclei rule. Part of this was also a survey to try to be like, as of today, given current approaches and particular research.

Caleb Sima: You were taking like, hey, out of all the things which I've seen, these are now the patterns. Oh, converting your custom query from natural language to your [00:06:00] custom one. You're basically taking all these very practical things that you're seeing happen in real life in products and pulling them into the things that you talk about. Is that right? Yeah. Okay.

Clint Gibler: This is a little bit meta, but maybe it's interesting for both of you as people who also create and share content. There's a couple of ways to structure a talk or a blog post or something like that. One is a top-down approach, where you're like: here's what I want to say, here's the message, and then let me find a bunch of examples and supporting evidence.

Another way is almost a bottoms-up approach, where you gather a bunch of resources and then you put them together, group them into themes, and the talk comes from that. So I did a little bit of both, but I think this talk was maybe a little bit more bottoms-up than usual. Summarizing things week by week,

I was like, okay, here's 400 links I've summarized, let's group them into the most similar things. And yeah, that English to domain-specific language one was very interesting, and I think it applies to any area of security. The core intuition, the sort of meta takeaway I had for that, was that with any tool, oftentimes there are these very powerful capabilities in terms of slicing and dicing data very intricately, but you need to learn [00:07:00] to do that, whether that's writing a very fancy SQL query or using some custom syntax you've never learned before.

So before you get the most value possible, you need to put in the work. So can we lower that barrier to make, say, totally new people perform at an intermediate level immediately?

Caleb Sima: You as a vendor or a product or a technology stack will have your own custom language, because that's probably the most efficient way in which you can get the job done.

However, today, or quote-unquote today, you had, as a human, to go and understand, get used to that language, figure out how it works. Similar to when you're writing Splunk queries: you need to know how it works. Versus in the AI of tomorrow, now you can just talk in your own language, and that translation will just happen behind the scenes.

So now that lowers the bar to where, hey, you don't have to have the challenge or the friction to go learn this thing; you can just talk to it in the way you need to.

Clint Gibler: Yeah, and I think there's a couple of specifics about why this is a great use case. So one, LLMs are great for translation, in terms of going from English to another sort of syntax that the [00:08:00] tool understands.

Another is that going from scratch to maybe a very detailed Splunk query is hard, but if you have something that's 80 percent of the way there and you just have to tweak it, you're like, there's this extra thing that it forgot. That's actually very easy. Similar to having an empty Word doc versus having an outline.

It's, yeah, it's easier to edit than to write from scratch. For example, if you want to turn off something in the cloud because you're like, oh, this is bad, that could be bad for production, right? You want to be highly confident you're doing the right thing. Whereas when you're just generating a query, it's totally fine if you're wrong.

Caleb Sima: So yeah, given that it's non-deterministic, you need to find the right use cases where that is okay.
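
The query-generation pattern discussed here, where an occasional wrong answer is acceptable because a human reviews the query before anything runs, can be sketched roughly like this. The `llm` callable, the prompt, and the Splunk-style output are illustrative assumptions, not any product's actual API:

```python
def english_to_query(request, llm, max_attempts=3):
    """Ask an LLM to translate an English request into a Splunk-style
    search query, retrying when the reply fails a cheap sanity check.
    A human still reviews the query before running it, so an occasional
    miss is acceptable -- unlike, say, auto-changing cloud config."""
    prompt = (
        "Translate this request into a Splunk search query. "
        "Reply with the query only, no explanation.\n\n" + request
    )
    for _ in range(max_attempts):
        candidate = llm(prompt).strip()
        # Cheap check for chatty refusals; real validation would parse the query.
        if candidate and not candidate.lower().startswith("i "):
            return candidate
    return None  # give up and write it by hand

# Stubbed model, for illustration only
fake_llm = lambda p: 'index=auth action=failure | stats count by user'
print(english_to_query("show failed logins grouped by user", fake_llm))
```

The same wrapper shape works for any English-to-DSL case (Semgrep rules, Nuclei templates): the non-determinism is contained because a bad generation costs a retry, not a production change.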

Ashish Rajan: We were trying to do this in one of the episodes. We had the OWASP Top 10, we had all the other reports, and we were trying to use ChatGPT, Gemini, and Claude to summarize them.

Anyway, I'm not saying that they did a great job; they did an okay job. Out of curiosity, in the research that you did, with 2,000-plus links that you had to go through and, as a human, make some sense of, did you consider using an LLM for it as well, and what was the result like as you went down that journey as [00:09:00] well?

Clint Gibler: Yeah, I've played around a bit with AI summarization of technical content specifically. Sometimes I think it does a very good job, where I look at what it writes and I'm like, oh, that's actually very similar to what I would write. Sometimes it's clearly very off. But I think the more concerning part is where it sounds good, but then I actually read the article and I'm like, this is actually a technically inaccurate understanding of what happened.

If I were to just read that and not read the article, I bet it would seem like it makes sense. But it could have, from a technical point of view, misrepresented what the work was actually saying, or how it worked, or some of the methodology. So perhaps as models get better, maybe that won't be as much of a problem.

At least for me, writing a security newsletter, I put my reputation on the line when I recommend things to people. I don't want to make them look bad, so I'm very careful about it. So I would say it can be very good at summarizing; just sometimes it's obvious when it's wrong, and sometimes it's subtle.

Caleb Sima: I have two questions for you. Out of your talk, what part is your favorite? The one where you're like, this was my favorite part to really talk about, that I wanted to express. And then number two, after you've given [00:10:00] your talk, what were the things people who came up to you afterwards talked about the most?

Clint Gibler: I really liked the fuzzing section. I'm not big into fuzzing; I do more program analysis, static analysis. So with static or dynamic analysis, there's a lot of: okay, cool, how do we reason about programs, how do we understand how user input flows through the program to a potentially vulnerable point.

There's a, like an elegance of understanding how things work, and you're like, this is what a vulnerability looks like, and then you find it. Whereas with fuzzing, you're just like throwing a bunch of input at it. And then you're like, Oh, and it crashed.

Ashish Rajan: Chill.

Clint Gibler: In my mind, like the analogy I was going to make is like encoding a lot of intuition into it is maybe like a static or dynamic analysis approach.

Fuzzing is just make the model bigger, baby.

Caleb Sima: The more crap you can throw at it in the most different ways you can is all that matters. Naturally, the way that you do fuzzing today is you do have to give it some structure to understand the protocol. Yeah. So your point is, hey, there is a benefit to not understanding the protocol, [00:11:00] and that will get you the surface-level breakage.

But in order to get deeper into the application, you need to abide by a certain protocol, so that the logic gets to the point at where it could break. And you have to have some sort of protocol defined to do deeper fuzzing. So then the question is, okay if I'm going to use AI, do I use AI and give it the structure that I think it needs?

Or, if I give enough training and enough data, to your point, will it naturally figure out the protocol and naturally, over time, get to the deeper parts because it will figure out what works and what doesn't work and it will adjust itself accordingly to create and communicate with it?

Ashish Rajan: I always thought of fuzzing a bit differently.

I thought of fuzzing as throwing things in the air and seeing what lands, because to get to that structure as well, I should have done some research beforehand to know, okay, it's a PHP app or a Node.js app or whatever. Before I get to the point of: oh, okay, it's looking for this, so I can try SQL injection fuzzing, does it work or not, cross-site scripting, the known patterns? Is [00:12:00] that what you mean by protocol? Or is that more,

Caleb Sima: I'll give you an example. Just take TCP. Yeah, right? Yeah. Three-way handshake. Yeah. You have to establish a three-way handshake before communication gets established. That's right. So in a fuzzing scenario, you can fuzz SYN packets all day long.

Yeah. But at some point, a SYN packet then has to come back and you have to get an acknowledgement. You then have to complete it before you can even start fuzzing the rest of the protocol, right? So if you're fuzzing something, there are barriers you have to get past in order to fuzz the next level of logic.

If you just throw a fuzzer at, hey, let's fuzz a TCP thing, it's gonna have to know enough about the protocol to figure it out. Or, if you just say, figure it out: oh, I have to do a SYN. Oh, I have to get an ACK. Oh, I have to do SYN-ACK. You know? Does that just naturally occur in order for it to reach the level needed?

Ashish Rajan: And did you find in the research that... I mean, where are we? Because to your point, obviously, you did a lot of [00:13:00] references, 2,000-plus references. Almost to make it easier for the audience to understand, did you have them in separate buckets? Like, okay, think about this from five buckets or six buckets, and this is what I gathered: the first pattern that came out of this bucket was this, the second bucket is... is that how you structured it?

Clint Gibler: Yeah, Caleb's totally right. Like, how do you get deep into the program to find potential bugs there? LLMs are very good at writing code, and one of the core challenges in fuzzing is you need inputs. For example, if it's a GIF-parsing library, you need a bunch of GIFs that exercise all the functionality, or if it's a library, you need basically test cases that, when run, execute more of the program.

So LLMs are good at writing code. For example, Google runs OSS-Fuzz, where they fuzz open source projects at scale, and they have this really cool approach, and they actually released a tool on GitHub, you can check it out, where they basically look at the coverage report from their fuzzing efforts, and then they're like, oh, we see this function isn't being exercised; let's see if an LLM can automatically write code that exercises the function that's not currently being covered, and then feed that into the fuzzing. So then it's [00:14:00] like, boom, the core difficulty in fuzzing is solved. It's almost like free bugs in a way, if you have test code that exercises a lot of the program. The idea is that writing that takes a lot of human effort and is hard and time-intensive, but an LLM can just look at the source, look at the docs, look at example code, and then programmatically write the test cases.

It's basically removing the main hard part of getting free bugs, at least with fuzzing. So I called out four or five different approaches that did that.
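
The coverage-gap loop described above can be sketched as below. The flat `coverage` dict and the `llm` callable are simplifications invented for illustration; the actual OSS-Fuzz tooling works on real coverage reports and compiles and runs the generated harnesses:

```python
def propose_harnesses(coverage, llm):
    """For each function the fuzzer never reached, ask an LLM to draft a
    harness that calls it, then queue the draft for the next fuzzing run.
    `coverage` maps function name -> times hit during fuzzing."""
    harnesses = {}
    for func, hits in coverage.items():
        if hits == 0:  # never exercised: a coverage gap
            prompt = (
                f"Write a small C fuzz harness that calls `{func}` "
                "with attacker-controlled input."
            )
            harnesses[func] = llm(prompt)
    return harnesses

# Stubbed model and toy coverage report, for illustration only
fake_llm = lambda p: "/* generated harness */"
report = {"gif_parse_header": 1200, "gif_parse_extension": 0}
drafts = propose_harnesses(report, fake_llm)
print(sorted(drafts))  # only the unexercised function gets a draft
```

In a real pipeline the drafts would be compiled, run, and fed back into the fuzzer, and the loop repeats as new coverage gaps appear.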

Ashish Rajan: That are available today that we've been used to start implementing this.

Clint Gibler: Yeah. There were a couple of tools that were open sourced. Yeah. I don't know if everyone had a tool, but yeah, there were at least a couple of things people can start playing with today.

Ashish Rajan: Oh, wow. Also, yeah, going back to the buckets that I was talking about in terms of different patterns: how would you describe that to the viewers? I'll obviously put a link to the talk as well when it's online. Like, what are some of those patterns or buckets that you were looking at in your talk?

Caleb Sima: He still hasn't answered my question. He hasn't answered my second... I had a follow-up question. Sorry, second question.

Ashish Rajan: On that note alright you're no. [00:15:00] Ow! Oh, it didn't. Oh, okay.

Clint Gibler: Part of me is curious and wants to try a new one. You should try a new one. You haven't tried that one yet. I haven't tried that one yet. Yeah. I feel like you know what it is.

Ashish Rajan: No, I've done it a few times. I know which one it is. Oh, by the way, for the other one. This one is not a good one, by the way.

Oh.

Caleb Sima: What color was it?

Ashish Rajan: What color was it?

Caleb Sima: It was like pinkish with dots on it. That was vomit. Was that vomit? Yeah. Yeah, that is not good.

Clint Gibler: I'm also going to point out that you had a couple of them in your hand, Ashish, and you pointed to that one to give me.

Caleb Sima: And you also said you know which ones they are.

Out of the people that saw your talk, what do you think most of their questions were about, or their favorite parts?

Clint Gibler: There were a couple of things. So one is, somebody was asking me about not just applying AI to cybersecurity, but broader project improvements overall. One thing that I really liked about the question was he said, you know, hey, I want to be a business enabler as a security person.

How do I provide value, whether that's helping engineers ship more quickly, and just all the things that I think modern security teams think about. Which I thought was awesome, and which I didn't really address in my talk, but I thought your [00:16:00] talk did well in terms of, hey, where does AI fit in the business,

not just specifically security applications. So I would say streamlining organizational things; I talked about that a little bit. Karthik from OpenAI's internal security team gave a keynote a bit ago, and then the OpenAI folks gave a follow-up talk at BSides SF yesterday.

So one example is, let's say you have a security Slack channel. People are coming in asking questions. Probably there are a couple of questions that get asked often. So why not have some sort of FAQ, or some RAG system where an AI can be like, oh, you want access to this sort of system? This is how you do it.

Or, you're building a new microservice in Golang? This is our template for how to do that. This is like just-in-time education. Or maybe your bug bounty program gets a lot of submissions, and a lot are out of scope. You could have an LLM look at the submission and be like, yeah, this looks like it's targeting an asset that we've explicitly called out of scope in our policy, so maybe auto-close it or message them. So one question was about AI within an org, not just security: streamlining operations, reducing toil, maybe. Yeah. And then another was: given that current models [00:17:00] can hallucinate, or are inconsistent, or maybe not fully trustworthy enough to totally rely on today, does it make more sense to have them augment people versus being an agent that can take action itself?
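
The bug bounty triage idea can be sketched as a cheap deterministic scope check backed by an LLM opinion. The domain names, prompt, and `llm` callable below are hypothetical, invented for illustration:

```python
OUT_OF_SCOPE = {"legacy.example.com", "blog.example.com"}  # hypothetical policy list

def triage(submission, llm):
    """Do the cheap deterministic check first (target in the published
    out-of-scope list), then let an LLM judge the borderline cases.
    Auto-closing is low-risk here: a wrong call means a human re-reads
    one report, not a missed production change."""
    if submission["target"] in OUT_OF_SCOPE:
        return "auto-close: target explicitly out of scope"
    verdict = llm(
        "Does this report target an asset covered by our bug bounty "
        "policy? Answer yes or no.\n\n" + submission["report"]
    )
    if verdict.strip().lower().startswith("yes"):
        return "queue for human review"
    return "flag as likely out of scope"

# Stubbed model, for illustration only
fake_llm = lambda p: "yes"
print(triage({"target": "legacy.example.com", "report": "XSS on login"}, fake_llm))
```

The ordering matters: the deterministic rule handles the unambiguous cases for free, and the model only sees what the policy text alone can't settle.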

Caleb Sima: Yeah. I guess the question is where is the difference, right? Is human in the loop counted enough as, hey, that's enough of an interaction? Or is it like, Hey, we can't trust an agent to do 30%, 40%, 80% of a thing.

Clint Gibler: Yeah, I think it depends on the use case, right? There are some things where you have to be right 100 percent of the time, and if you're not, it's a big deal.

And there are other cases where, for example, yeah, English-to-rule, it's okay if it's sometimes wrong. Yeah. At least in my experiments, sometimes you'll be like, I want you to do this and then give me JSON output. And 99 percent of the time it gives you JSON output, but 1 percent of the time it won't.

And then you have to build error correction stuff around it, and you're like, why did you do this right so often, but not always? It depends on the use case, I think.
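
That "error correction stuff" around the stubborn 1 percent of non-JSON replies is typically a small retry wrapper. The `llm` callable below is a stand-in stub, not a real model client:

```python
import json

def ask_for_json(prompt, llm, max_retries=3):
    """Request JSON from a model and retry with a corrective prompt when
    the reply doesn't parse -- the wrapper you end up writing for the
    1 percent of replies that come back as chat instead of JSON."""
    reply = llm(prompt)
    for _ in range(max_retries):
        try:
            return json.loads(reply)
        except json.JSONDecodeError:
            reply = llm(
                "Your previous reply was not valid JSON. "
                "Reply again with ONLY valid JSON.\n\n" + reply
            )
    raise ValueError("model never produced valid JSON")

# Stub that fails once, then complies -- for illustration only
replies = iter(['Sure! Here it is: {...}', '{"severity": "low"}'])
fake_llm = lambda p: next(replies)
print(ask_for_json("Classify this finding as JSON.", fake_llm))
```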

Caleb Sima: Yeah, actually, I was having a discussion with some people about this yesterday, around access permissions, right? Which is, okay, when you do user access reviews, today you get this thing that [00:18:00] says, hey, Bob wants access to this system.

And then what do you do? You either say, okay, does Bob need access? Why does he need access? Who needs access? I don't know any of that. So what I'll do is I'll delegate it to the closest owner of the thing which Bob wants access to. That person probably also has the same questions.

I don't know Bob that well. What does he need access for? Why does he need this access? And then the question is, okay, given enough data and context, should I just approve it? Yeah. And what happens is most people just look at this and go, okay, is it actually an employee of the company?

It looks reasonable, so you just approve. Yeah. And so even in our permissions, the way we do permissions is very vague. Oh, that's weird... maybe not. But, generally, yes. So then, what happens if AI... should AI automatically have the ability to accept and allow these permissions instead of people?

And most people instinctively, I already see it with you guys, as security people: hell no! You can't do that. You [00:19:00] can't just allow an AI to approve permissions. That's a recipe for nightmares. But yet, I think today's state of this is actually worse than what an AI would do if it auto-approved these things.

It would probably be better than the way that we're delegating and doing access permissions today. Yeah, so it's sort of, hey, do you need a person in the middle, in the loop, for this? Or actually, is it vague enough, and better than today's state, to allow AI to make these decisions?

Clint Gibler: I'm gonna surprise you by, I think, agreeing with you.

Oh. What I think would be good about having an AI auto-approve those things is that if you were to do that, it would force you to actually write a detailed policy: this is the criteria we look for, this is reasonable, we think this is low risk, this is medium, this is high. Where right now, yeah, it's very fuzzy, hand-wavy, probably not written down in most places. I think it would actually add a lot of organizational maturity to be very specific: we have this prod database, super critical, you need to have hard assurance, only these three people are ever allowed, and only temporarily, and [00:20:00] you would have a detailed decision tree for what's okay and what's too high risk. Whereas it's, yeah, you need temporary admin to the dev cluster that is totally separated from even staging and prod.
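
The point above, that an AI auto-approver forces you to write the policy down, can be made concrete with even a tiny decision table. The resource names and risk tiers are invented for illustration:

```python
# Hypothetical access policy, written down explicitly so an automated
# approver (AI or otherwise) has unambiguous criteria to apply.
POLICY = {
    "prod-db": "high",
    "staging": "medium",
    "dev-cluster": "low",
}

def decide(resource, is_employee):
    """Apply the written policy; anything unlisted defaults to escalation."""
    if not is_employee:
        return "deny"
    risk = POLICY.get(resource, "high")  # unknown resources treated as high risk
    if risk == "high":
        return "escalate"  # prod-critical: always a human decision
    if risk == "medium":
        return "needs manager approval"
    return "auto-approve"  # low risk, e.g. time-boxed dev-cluster admin

print(decide("dev-cluster", True))  # low-risk: auto-approve
print(decide("prod-db", True))      # high-risk: always a human
```

Today this table lives implicitly in whoever happens to click "approve"; writing it down is the organizational-maturity step, whether a model or a script applies it.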

Caleb Sima: Going back to our earlier conversation, you actually shouldn't do that, because what you're trying to do here is give enough structure to the AI for it to make decisions. Versus: why not just give it enough data and enough capability that you may not even have to define that at all?

Should an AI be able to learn, hey, this is a production cluster, these are the types of accesses we've seen? Given enough data over time, it should then make its own sort of logic and reasoning. It's like I said earlier: why try to give it structure? Actually, all it needs is more data, and it will do better at granting permissions than you trying to give it any decision tree.

Ashish Rajan: But would you guys say we are there yet? Because the reality of where people would probably start this conversation is, even with the fuzzing example that we went back to, I feel the barrier, to your point, to auto-approve is [00:21:00] the same as the barrier for me sitting comfortably in a self-driving car.

As much as I know it's going to do the right thing, there was that Cruise thing that happened a few weeks ago or whatever, here. I feel like the barrier is not going to be the fact that I'm okay with auto-approve. It's more the first-goers, or whatever you call the first people who want to be like, yes, I like the idea.

The biggest challenge I feel there would be, a) acceptance: broader acceptance that it's okay for me to bring my Slack data, my Jira data, to know that the request that came in for user access, like what you showed in your example, is completely fine. Because, hey, Ashish requested it, he's from the same org, there was a Slack message about it and a Jira ticket raised for it, all that context.

I don't know. At least for now, and you can call me a skeptic if you want, in terms of how I look at the trust level that I'm putting on my colleagues in my company, they will always take the path of least resistance, and to approve is the path of least resistance.

Caleb Sima: Actually, I think, Clint, your talk is really more about AI today and real [00:22:00] applicability, right? Yeah. My talk was a little bit more about the next year to two years. But yeah, auto-approving permissions, I personally believe, is a little bit more of a year to two years away. I don't think today we've gotten to that level of applicability.

Ashish Rajan: Even our models, they're not ready to do that as well. Even if you were to find a company that's forward-thinking enough to say, I'm going to go all in and do the auto-approve: as people who are probably talking a lot about AI security, do you think we would be comfortable telling someone that, hey, LLMs are there at the moment?

Caleb Sima: I would, given enough data. Yeah, because here's what you're going to do. You're going to do it just like we do in Tesla self-driving cars, right? You're going to get in the car, you're going to turn the thing on, and you're going to hold your hands over that wheel. And you're going to grip it really tight.

Yeah, I don't know... okay, take it back, okay, turn it back on. And then, you know, given enough time, over a month, over two months, over three months, you trust, right? You learn, you trust it's making the right decisions, and then pretty soon you're on your phone, even though you shouldn't be. But that's the way it starts.

Similarly, I think in any of these things, you're going to be: I want a human in the loop every single time, do the click-click. And then over [00:23:00] time you start trusting. Hey, if its error rate is like five percent, that's phenomenal. Yeah. Because today, if you look at the current state of today, no one can measure error rate.

But if you did, it's probably not great, right? And people just... you have no confirmation whether the person you delegated to really made the right decision or didn't.

Ashish Rajan: In the cloud space recently on Cloud Security Podcast, we were having this conversation about least privilege. Yeah.

It's thrown at the wall all the time. Yeah, we should do least privilege. Oh yeah, a hundred percent least privilege, I totally believe in that. But what does that look like in true reality? It's hard to define. Ashish today is a security architect;

tomorrow he becomes a CISO. The level of access may be very different. What least privilege looks like according to me personally, versus the organization, versus based on all the data, is a lot different. To what you were saying as well, just to add to it: that initial phase of, hey, I'm going to hold on to the steering wheel...

A lot of people have started talking about the fact that, hey, maybe we can do auto-remediation. If someone does come with a request for access [00:24:00] management, we can look at that. If it's not prod, it's okay. If it's dev, it's okay.

Caleb Sima: Yeah. And actually, again, talking about current state: if you look at accidents that have occurred in AI-driven cars versus, let's say, normal accidents that happen in human-driven cars, what's the difference?

Clint Gibler: Yeah, I was going to say, you were saying before, I'm also worried about being in one of these cars and it gets hacked, and then some nation state is like, I'm tired of Cloud Security Podcast. I'm driving.

Ashish Rajan: Damned accidents. tldr newsletter

Clint Gibler: They talk down to me one too many times.

At least the data I have seen, and I haven't looked into this in a ton of detail, but I think per mile driven, Cruise or some of the other ones are safer than human drivers.

Caleb Sima: Correct.

Clint Gibler: And what's interesting is, yes, the accidents do get a huge amount of media coverage, but nobody is Hey, another human driver got tailboned on

Caleb Sima: third and fourth,

Ashish Rajan: yeah.

Because, yeah,

Clint Gibler: That happens like a hundred times a day, every day, in every city in the U.S.

Ashish Rajan: Or someone drove into a building. Even those scenarios, at least I haven't heard of a scenario where a self-driving car drove into an actual [00:25:00] building for whatever reason.

You see more humans doing it than actual self-driving cars.

Caleb Sima: Which, if I were to abstract this a little bit, and this is going to be a very high-level abstraction, I also feel we as security, as an industry, make perfection the enemy of good. I think this just comes from the core of how we're always about poking the smallest hole.

Yeah, any risk. That's just where we've come from. And doing that without realizing that this is variable, and the variability is okay.

Clint Gibler: like maybe the current state of the world is 20% in your company. And then someone is like, Hey, we can do this and it gets us to 70%.

And maybe somebody is like, no, it's not getting us to a hundred percent, we should do this other thing that's going to take four years to get us to 100 percent. And you're like, yeah, but in two months we can get to 70%. Why not? Way better. I know.

Caleb Sima: It's okay.

Caleb Sima: Yeah. And it's basic, boring things that we should go focus on, right?

Like it's not that hard to go do this.

Clint Gibler: Which is one thing I really liked about your talk, [00:26:00] actually, where you were talking about, I think you claimed, in your experience many breaches are a result of a coverage gap, whether that's breadth or depth. Where it's, yeah, you had the right tool, you had the right alerting, but someone used their personal credit card to spin up a new AWS account.

Who saw it? Yeah. Yeah.

Caleb Sima: That's right. In fact, I said in my talk, I could give an entire separate talk just on coverage alone, because of the amount of incidents. Like, when I was at a vendor, I decided to become a defender and work at a company, because I wanted to know why the breaches I read about were all these simple things.

Yet it seems like vendors are building technology that can solve that. So why does that happen? Why do we have all of these vendors that clearly can solve that, and yet this still happens? And then people will say, guaranteed, that's because it's not a technology problem, which is true. It's a people and process problem, that is also true. And I also think it's coverage. It really is, oh, the reason that S3 bucket was open isn't because we don't [00:27:00] have the right SCPs, or people pentesting or scanning our stuff, or we don't have a CSPM, it's because that bucket was actually a partner's bucket that we don't have coverage over.

We weren't aware it had our stuff in it, yet people were putting stuff into it. It's a gap in coverage. It's all the things that fall through the cracks.

Clint Gibler: Everyone uses MFA except the contractors who have access to sensitive things.

Caleb Sima: Who we had to make an exception for in order to do this, right?

When did we make an exception for this? So that was like three months ago. Yeah. Did it get to me? No, because you're the CISO. It can't go all the way to you. Yeah. Like, where did this come from?

Ashish Rajan: Yeah. It's a scale challenge as well, because after a certain scale, it's nearly impossible for you to know what each of the developers is doing. If you have 10,000 developers, how do you keep track of every single thing, every single open source experiment they're doing? You can't.

Clint Gibler: Another example of a promising application of AI across a bunch of different domains: if you think about it, as a security team, you want visibility.

What's happening? There's a lot of stuff there, but you don't have time to look at it. [00:28:00] But what if you had, I think you said, 10,000 junior engineers.

Caleb Sima: Smart 13-year-old.

Clint Gibler: 13-year-old. So you could have them continuously monitor every scope doc or every README of every project. It's like, oh, this used to be a read-only web service, but now you're accepting file uploads.

Yeah, that's interesting. Or, hey, this one security architecture assumption where we were like, oh yeah, this is fine for now, only this thing is going to talk to this thing and that's all it's ever going to be. And then it's, oh, there's this new thing we're spinning up that's violating that assumption we previously had.

You can imagine almost building a knowledge graph from scanning the repos. Okay, this seems to be the purpose of this repo. We're reading all the wikis, we're reading every PR. One thing I think would be cool is, for every PR or new thing, just ask:

Clint Gibler: Does security care about this?

Caleb Sima: It doesn't really care.

Clint Gibler: Oh, this is, you changed some wording in the footer. Yep, I don't care. Or it's, oh, you added a new route and it's unauthenticated. Yeah, that's interesting.

Caleb Sima: And I think this is very viable. It's a very simple thing to do with a prompt.

Clint Gibler: [00:29:00] I think you could do this today.
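As a concrete illustration of the "does security care about this PR?" idea: a minimal sketch, where `ask_llm` is a hypothetical placeholder for whatever chat-completion API you use, stubbed here with a keyword heuristic so the sketch runs standalone. The prompt wording, hint list, and function names are all assumptions, not anything from the talk.

```python
# Hypothetical security-relevance hints; a real deployment would rely on
# the model's judgment rather than keywords.
SECURITY_HINTS = (
    "unauthenticated", "upload", "deserialize",
    "exec(", "secret", "permission", "crypto",
)

TRIAGE_PROMPT = """You are a security reviewer. Given this pull-request
diff, answer REVIEW if a security engineer should look at it, or IGNORE
if it is clearly benign (copy changes, comments, formatting).

Diff:
{diff}
"""

def ask_llm(prompt: str) -> str:
    """Placeholder for a real model call; swap in your provider's client.

    Stubbed with a keyword heuristic so this sketch runs standalone."""
    body = prompt.lower()
    return "REVIEW" if any(h in body for h in SECURITY_HINTS) else "IGNORE"

def triage_pr(diff: str) -> str:
    """Classify a PR diff as REVIEW or IGNORE."""
    return ask_llm(TRIAGE_PROMPT.format(diff=diff))
```

Wired into CI, this runs once per PR and only pages the security team on REVIEW verdicts.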

Caleb Sima: You could do this today. Yeah. And the thing about coverage is it can go everywhere. Think about depth, where I was even saying every single person I know has a vulnerability dashboard where they know their mediums and lows. Every company today has an alert dashboard for their detection and response team where they ignore 80 percent of the alerts that are coming in.

It's not that they're false positives. I didn't get anything.

Ashish Rajan: I thought you got something.

Caleb Sima: Yeah. I mean

Ashish Rajan: You're like, oh yeah. I think everyone makes that face as well. And you hit the nail on the head there, everyone agrees you're on the money. Because every time you raise it with an application team, it's, is that a medium?

Why are you bothering me with a medium? Yeah. Why are you bothering me with a low?

Caleb Sima: That's right. The thing is, these are important signals to get. You need to chase the signal, right? Or you need seven of these signals in order to determine whether it's an event. Hey, there's a large query on a database.

Is that bad or good? Should you get that signal? Yeah, you probably should. But someone needs to [00:30:00] determine whether that signal is a bad or good signal. And that requires a lot of work. Yeah.

Clint Gibler: Because if, all the time, there are these queries retrieving a ton of data at the same time, cool, maybe that's like a job that's backing something up. Or it could be a threat actor that's like, oh, let me steal as much as possible.

Ashish Rajan: I was going to say, in your research, what we're talking about is the security perspective, but most technology things start on the engineering and technology side. I don't know if you guys have done some snooping around or investigation into that side. Because I feel like, if this was something that started on the DevOps or engineering side, how hard would it be for them to say, actually, I wonder if I can just add a check for security, so Caleb doesn't bother me anymore.

Clint Gibler: A couple of things come to mind.

Clint Gibler: So actually, how I led into the fuzzing section in the talk is, I was like, here's a paper from Meta about using LLMs to automatically generate unit test code.

Ashish Rajan: Yeah.

Clint Gibler: Writing unit tests is a pain. There are tons of open source projects, even some companies, that are like, don't worry about it, we'll generate unit tests for you.

That's basically the same thing as fuzzing, with a 2 percent difference, right? That's one thing. Another thing you talked about, that I also talked about, that I think is interesting and would encourage people to think about is: what are the general things that LLMs are good at that can also be applied in the security space?

So specifically, let's say you have lots of Discord or Telegram discussions from some sort of APT or cyber criminal group, or you have access to a dark web marketplace and people are chatting in the forums and you want to know okay, what are they talking about? What are they claiming? What do they do?

Guess what? That's a summarization problem. That's exactly the same as Otter, Zoom, or anything you talked about, in terms of: you have a bunch of text of people talking about things, planning things,

Caleb Sima: Let's summarize it. What are the main points? What are the key points? And you're like,

Clint Gibler: Oh, the leader said take over their DNS, and we got access here.

You can imagine that being pages and pages of text between APTs and cybercriminals, and just, okay, cool, we got the meeting notes for that. So I thought that was very cool: LLMs are good at this, and by the way, when applied in a security domain, it's still useful.

Caleb Sima: Which again, also doable today.

Yeah, [00:32:00] 100 percent doable.

Clint Gibler: People are already doing it.
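The chat-log summarization described above is essentially map-reduce over text: chunk the log so each call fits the model's context window, summarize each chunk, then combine. A minimal sketch, where `summarize` is a placeholder that only builds the prompt that would be sent to a model; the prompt wording and size limit are illustrative assumptions.

```python
def chunk(lines, max_chars=4000):
    """Greedily pack lines into chunks of at most max_chars characters."""
    chunks, cur, size = [], [], 0
    for line in lines:
        if cur and size + len(line) > max_chars:
            chunks.append("\n".join(cur))
            cur, size = [], 0
        cur.append(line)
        size += len(line) + 1  # +1 for the joining newline
    if cur:
        chunks.append("\n".join(cur))
    return chunks

SUMMARY_PROMPT = (
    "Summarize the key claims, targets, and planned actions in this "
    "chat excerpt as bullet points:\n\n{text}"
)

def summarize(text):
    """Placeholder: return the prompt a real model call would receive."""
    return SUMMARY_PROMPT.format(text=text)

def summarize_log(lines):
    # Map: summarize each chunk. A reduce step would then summarize the
    # partial summaries into one report.
    return [summarize(c) for c in chunk(lines)]
```

The same skeleton works for forum dumps, Telegram exports, or ticket backlogs; only the prompt changes.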

Caleb Sima: That's the kind of practical thing you can do. I would also think a little bit about tomorrow as well, which is, I also feel like engineers in general will never, quote unquote, care about security.

As a security person, I have a hard enough time understanding security. I just don't think it's something that will be top of mind for every engineer. Except at Microsoft. So taking that as a given: when I write my code, let's assume in the future you're writing a lot of requirements and everything's going to be very highly requirement-driven, very highly test-driven.

Then code is being created and put together. I think it's easy then for security requirements to be analyzed and embedded in these requirements documents, from which these things then generate the code to meet them. Everything moves earlier in that cycle. I actually said in my presentation, one of the things I think will also happen in engineering is [00:33:00] you have self-documenting code.

Which means, anytime an engineer writes something or anything changes in code, there is a, quote unquote, you can think of it as a manual or a wiki page: what is this app? What does it do? How does it work? I think that thing could be auto-created, written by LLMs, so that you could read it and say, oh, these things are security-centric, these things should be modified and changed, and then the requirements are rewritten.

I think all that stuff is very doable. Great.
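The self-documenting-code idea reduces to assembling repo context and asking a model for the manual. A minimal sketch that only gathers the context and builds the prompt; the prompt text, file selection, and function name are illustrative assumptions, and the model call itself is left out.

```python
from pathlib import Path

DOC_PROMPT = """Write a short manual for this application:
what it is, what it does, how it works, and any security-relevant
surfaces (routes, permissions, data stores).

Files:
{files}

README:
{readme}
"""

def build_doc_prompt(repo: Path) -> str:
    """Assemble the context a model would need to document a repo."""
    # File inventory; a fuller version would also pull docstrings,
    # route definitions, and recent PR descriptions.
    files = "\n".join(
        str(p.relative_to(repo)) for p in sorted(repo.rglob("*.py"))
    )
    readme = repo / "README.md"
    text = readme.read_text() if readme.exists() else "(none)"
    return DOC_PROMPT.format(files=files, readme=text)
```

Rerun on every merge, the resulting page doubles as the input for the security-requirements review Caleb describes.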

Are we saying Waterfall is coming back? Yeah.

Ashish Rajan: Like, AI-powered Waterfall, anyone?

Caleb Sima: Requirements as code, trademark. Yeah, I was gonna say. Like Waterfall as code, maybe. That would stick.

I was having a conversation with some people after my talk about this.

And the spicy thing I'm supposed to post is: what's Agile but just a shorter form of Waterfall?

Clint Gibler: I think that's a really powerful idea and it also makes me think of that sort of joke like, bro, the code is the docs.

Caleb Sima: Yeah, it's actually true, isn't it? And I [00:34:00] think it's a very viable and easy thing to do. Let me say, not easy, but it seems feasible. It's feasible.

Yeah, it's a tractable problem.

Ashish Rajan: I feel like the job of project managers in that world of tomorrow would be to ask one of those Amazon Q-type tools, hey, what's the progress of this project, instead of having standups. That's why I was talking about the whole Waterfall thing as well.

I feel in a way, we could still have elements of Agile to have a showcase of what we're doing or whatever, but primarily there's no need for a standup because you can just ask, Hey, what module was added by Ashish yesterday? Is that working? Not working? Is the ticket going to move? Like all those questions could be coming more from the engineering side as well.

The project managers, the engineers, anyone who wants the progress of how quickly we can do this. Oh, we had the timeline set for six months, but because Ashish has been a shitty coder, not spending enough time on it, now it's going to be like seven months or whatever.

Caleb Sima: You saw in mine, I predicted the end of status reports.

Oh, yeah. Yeah. That's what I really want. Yeah.

Ashish Rajan: That would be [00:35:00] interesting. And I wonder how many people actually have a full time job just doing that. They will be worried.

Caleb Sima: No, they're just now going to be focused on doing things that matter. I gave an example in my presentation of my least privilege dashboard, right?

Which is: these are the assets we consider high risk, and all I want is an update on how least privilege is doing on these assets, which consists, very simplistically, of two things. How many accounts are on each asset, and are they going down? And of the accounts that are on there, are their privileges going down?

Whether it's AWS or whatever cloud it is, an Oracle, you can write the right prompt that basically says: here are the assets, I want to know these permissions, I just want you to output them and format them this way, and boom, you automatically get a dashboard, done. Every time you need an update, you rerun your prompt, and it will just write it up for you.
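The dashboard described above boils down to two trend lines per high-risk asset. A minimal sketch, assuming the account-to-permission-count data has already been exported from your cloud provider's IAM APIs; the snapshot format and names are made-up illustrations.

```python
def trend(prev, curr):
    """Label the direction of a metric between two snapshots."""
    return "down" if curr < prev else ("flat" if curr == prev else "UP")

def dashboard(prev_snap, curr_snap):
    """Each snapshot maps asset -> {account: permission_count}.

    Returns one row per asset: are account counts and total granted
    permissions trending down?"""
    rows = []
    for asset in sorted(curr_snap):
        p, c = prev_snap.get(asset, {}), curr_snap[asset]
        rows.append({
            "asset": asset,
            "accounts": trend(len(p), len(c)),
            "permissions": trend(sum(p.values()), sum(c.values())),
        })
    return rows
```

In practice, the export step is the part an LLM prompt (or plain API calls) would automate; the dashboard itself is just this comparison rerun on each update.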

No one needs to go run around and check to see what the engineers are doing. None of this stuff needs to be.

Ashish Rajan: I think you were right that your talk would be taken as almost like a road map. It is a road map, because I feel like [00:36:00] it ticks a lot of the boxes for what people are doing with the whole CSPM and all of that.

Ashish Rajan: It definitely would take them out of the way. I would definitely recommend it.

Caleb Sima: If you're a vendor or a startup or a founder,

Ashish Rajan: it's a road map. Yeah. It is coming. It is. I'm sure your talk is also the same.

Clint Gibler: Are you saying if you watch this talk, you agree to give me 1 percent of any company that you've got,

Caleb Sima: it's a 10 percent minimum.

Oh, okay. Come on, Clint. Let's like, if I come up with the idea, ideas are hard, right? In the AI space.

Ashish Rajan: Last question, because we're almost on time. What's the tldr of today, two years, three years out? We're obviously right next to RSA. Is there anything you're predicting is going to come out of it? Because a lot of people are being secretive as well, some of them.

What's your prediction?

Clint Gibler: A lot of the things I see seem very much like prototypes, like somebody is like, hey, I had an afternoon or a weekend and they hacked this thing [00:37:00] together. I think we're gonna see a lot more like mature things that have had some serious engineering go into it. Yeah, and that's just a meta observation from the things I've read.

We see a lot of just single-prompt things, where it's a prompt plus some things, give me the output, but not as much agents and tools and multi-step workflows. I think we're going to see a lot more infrastructure and complexity around the model, which is going to let existing models perform orders of magnitude better.

So even if the models stay the same, I think the glue around them will make the overall outcomes like 10x what they are today. There are certain applications that models don't seem good at today, for example, finding vulnerabilities in source code. There's a paper I referenced in the talk that found, at least on the benchmark they created, GPT-4's performance was akin to basically guessing, which is not good.

Caleb Sima: Yeah, these models are predictors, and finding vulnerabilities, I feel, is actually finding the things that occur least often.

Clint Gibler: Yeah, they're familiar with patterns, but do they really understand what the implication of the code is? It's unclear. I think models are going to keep getting better.

One thing that's changed much faster [00:38:00] than I expected is context window size. For example, Gemini is at like 1 million, and 10 million in their research stuff. Not that long ago, 4K or 8K was huge, right? That's many orders of magnitude bigger, so that's a huge change. What we can do with that much more context is interesting.

I think people will start thinking of AI more as a tool to solve problems rather than just throwing AI at everything. One of the predictions in the talk is that within, let's say, two years, we'll have AI pen testing bots which, within a narrow scope, can perform as well as entry-level pen testers.

Let's say a college new grad with one year of experience. I think an AI bot will be that good at web app pen testing within, let's say, two or three years. I don't know if they'll ever be as good as, say, Project Zero. But at the entry level, AI systems will be that good soon.

I think sooner than we expect.

Ashish Rajan: The OWASP Top 10 really hasn't evolved that dramatically since it's existed; the Top 10 has primarily remained the same. Even if it only takes the Top 3 or Top 4 out of it, you're still like, oh, that's pretty good, if it can do that for both internal and external applications.

But no, good prediction.

Clint Gibler: But yeah, I think we're going to see agents and tool-based [00:39:00] systems autonomously handling some things, with probably a human in the loop.

Caleb Sima: My prediction, you can just go see my talk, because it's all there. I feel like in general the industry is looking at AI as very overhyped. I think people are treating it like artificial superintelligence when we haven't even hit artificial general intelligence yet.

And people are thinking it's going to do these things that are super amazing, but in reality it just doesn't. So my prediction is, in the next year we're going to see a trough of disillusionment with AI. You're going to see a lot of these money-pushed projects, a lot of people trying to duct tape AI onto the side of things, and we're going to see it just not work. People are going to go, oh, my expectations were here and it fell here, and start dismissing these things. However, similar to what Clint said, I think what happens then is there's this turn of people who are building AI at its core and understanding AI, and it's [00:40:00] not just about duct taping things and expecting amazing results.

In the following year after that, you're going to start seeing these things emerge, and they are going to start doing some pretty amazing things. I think Clint was right: it's not just about the models themselves improving, it's the glue and the systems for how to manage those models better.

That's really going to make the big difference. And that takes time. It takes understanding, it takes doing it at its core. I think the things we're talking about in our predictions start becoming more reliable, more trustworthy, in a year and a half to two years, when the trough happens and these new people come out of it.

Clint Gibler: Yeah, and I think a year and a half in AI time is five or ten years in normal time.

Ashish Rajan: Yeah. To what you were talking about, the disconnect between Jira and Slack, that may still remain in two years, you reckon? And obviously we're going a layer deeper, because for me to get an answer for, hey, how is Ashish doing as a programmer on the project?

Or, is my project still running on the timeline I was expecting? That would [00:41:00] require a lot more partnership between all of these. Today Canva has AI, Jira has AI, your productivity tools have AI, Google has AI, all of them do, but they're all isolated. To what you were saying in your talk as well.

You mean the Oracles? Yeah, the Oracle agents. Do you reckon that's going to be in the next two years as well?

Caleb Sima: Yeah, I don't think you're going to get really good results out of an Oracle agent this year at all. I think it'll be end of next year before I feel like you're going to get anything that is reasonably good for any of the services.

Minus, of course, the possibility that GPT-5 or whatever comes out with, like, infinite context windows; then maybe some fast changes happen. But in every industry, from mobile to cloud to others, I think the really impactful changes, although it feels simple at first, never come until later down the road.

You don't know how fast these things go. Cloud made a big difference, but you didn't see the really impactful things in cloud until three or four years past, [00:42:00] I would even argue six or seven years later. So I don't see why AI would be any different. There's obviously a steep curve, but even if the technology is available, the understanding of how to use it effectively, properly, in a scalable, manageable way still takes time.

Clint Gibler: Actually productionizing it versus... I was going to say, I have a one-off joke that I wanted to fit in.

Ashish Rajan: You should definitely put it in.

Clint Gibler: Caleb, I really enjoyed your talk, but I heard there were a couple of people that were so bored. They left 10 minutes in.

Ashish Rajan: Were you hurt?

Caleb Sima: Because my talks are that boring? Yes, Clint. So Clint is talking about my kids. I brought my two kids, who are six and four, and my wife to my talk. I only brought them because, one, I don't think my kids think I do anything. They see my wife, who's a chef, and she's cooking and doing things, and they're like, dad doesn't do anything.

So I was like, [00:43:00] okay, fine, you can actually come and see me speak in front of an audience. And I actually did put a timer on it. I was like, I feel like maybe they'll make it 10 minutes in, and I was pretty much spot on.

Ashish Rajan: A hundred percent. On that high note, I think we can call it a close. Any final thoughts that you want to share?

Obviously we'll link your talk as well as your talk on the episode as well.

Clint Gibler: There are a lot of people who probably totally dismiss AI because it's so overhyped, or it feels that way, and I felt that way for a long time. Are you saying it's not like NFTs? Yeah, I think unlike NFTs, it solves a real problem.

Or cryptocurrencies in general. But what I tried to do in my talk is be very realistic about today: here's what it can do well, here's what it's bad at. My argument is it's a tool like any other that can help you with some things. It's not going to solve everything. But like with many applications, it's only getting better over time, for at least the next few years, so check it out. If it can help you with things, that's great. If not, that's fine. If you want to see the latest and greatest, feel free to check out tldrsec.com, a totally free newsletter. And yeah, thanks so much for having me. It's been a pleasure. No, thank you.

Ashish Rajan: [00:44:00] Thanks so much for coming on the show and definitely follow the newsletter as well.

Thank you so much for listening to that episode of AI Cybersecurity Podcast. If you're wondering why we aren't covering all topics, it's because the field is evolving too quickly; we may not even know some of the topics we haven't covered yet. If you know of a topic we should cover on AI Cybersecurity Podcast, or someone we should bring on as a guest, definitely email us at info at cloudsecuritypodcast.tv. Which reminds me, we have a sister podcast called Cloud Security Podcast, where we talk about everything cloud security with leaders, similar to the AI cybersecurity conversation.

We focus on cloud security, specifically in the public cloud environment, at CloudSecurityPodcast.TV. If you find it helpful, definitely check out www.CloudSecurityPodcast.TV. Otherwise, I look forward to seeing you on the next episode of AI Cybersecurity Podcast. Have a great one. Peace.
