My startup is building agents for automating pentesting. We started experimenting with Llama 3.1 last year. Pentesting with agents started getting good around Sonnet 3.5 v1.
The switch from Sonnet 4 to 4.5 was a huge step change. One of our beta testers ran our agent on a production Active Directory network with ~500 IPs, and it was able to privilege-escalate to Domain Admin within an hour. I've seen it one-shot scripts to exploit business logic vulnerabilities. It will slurp down JS from websites and sift through it for API endpoints, then run a Python server to perform client-side analysis. It understands all of the common pentesting tools with minor guard rails. When it needs an email address to authenticate, it will use one of those 10-minute disposable email sites with curl and Playwright. I am conservative about my predictions, but here is what we can learn from this incident and what I think is inevitably next:
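To give a rough idea of the JS-sifting step, here's a minimal hand-written sketch in Python (the URL and regex are placeholders I'm making up for illustration, not output from our agent):

```python
import re
import requests

# Placeholder target; in practice the agent enumerates <script src=...> tags first.
JS_URL = "https://example.com/static/app.bundle.js"

# Rough heuristic for API-looking paths embedded in bundled JS.
ENDPOINT_RE = re.compile(r"""["'](/api/[A-Za-z0-9_\-/{}.]+)["']""")

def harvest_endpoints(js_url: str) -> set[str]:
    """Download a JS bundle and return the API-style paths referenced in it."""
    body = requests.get(js_url, timeout=15).text
    return set(ENDPOINT_RE.findall(body))

if __name__ == "__main__":
    for path in sorted(harvest_endpoints(JS_URL)):
        print(path)
```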
Chinese attackers used Anthropic (a hostile and expensive platform) because American SOTA is still ahead of Chinese models. Open weights are about 6-9 months behind closed SOTA. So by mid-2026, hackers will have the capability to secretly host open-weight models on generic cloud hardware and relay agentic attacks through botnets to any point on the internet.
There is an arms race between the blackhats and private companies to build the best hacking agents, and we are running out of things the agent CAN'T do. The major change from Claude 4 to Claude 4.5 was the ability to avoid rate limiting and WAFs during web pentests, and we think the next step is AV evasion. When Claude 4.7 comes out, if it can effectively evade antivirus, companies are in for a rude awakening. Just my two cents.
> I've seen it one-shot scripts to exploit business logic vulnerabilities.
Would you be able to share the prompt to generate such scripts?
Just ask Claude etc. to write the prompts for ya.
Skipping over the cringe writing style, I really don't get the hate on Anthropic here. What would people want from them? Not disclose? Not name names? I'm confused how that would move the field forward.
At the very least, this whole incident is ironic in that a Chinese threat actor used Claude instead of the myriad of Claude killers released in China every other week.
At another level, this whole thing opens up a discussion about security, automated audits, and so on. The entire industry lacks security experts. In the EU we have a deficit, from bodies in SOCs to pen-testers. We could definitely use all the help we can get. If we get past the first wave of "people submit bullshit AI-generated reports" (which, for anyone who has ever handled a project on H1 or equivalent, is absolutely nothing new - it's just that in the past the reports were both bullshit and badly written), then we get to the point where automated security audits become feasible. Don't value "reports"; value "CTF-like" exercises, where agents "hunt" for stuff in your network. The results are easily verified.
I'll end on an idea that I haven't seen mentioned in the other thread that got popular yesterday: for all the doomerism out there about vibe coding, how insecure it is, and how humans will earn a paycheck for years fixing vibe-coded projects, here we have a bunch of companies with presumably human devs that just got pwned by an AI script kiddie. Womp womp.
They probably used Claude because that way they don't get blocked as fast. Websites trust Claude more. And why not use the foreign tools against themselves, at presumably discounted rates (see AI losses), rather than burn your own GPUs and IPs?
Thousands of calls per second? That's a lot of traffic. Hide it in Claude, which is already doing that kind of thing 24/7. Wait until someone uses all the models at the same time to hide the overall traffic patterns and security implications. Or has AIs driving botnets. Or steals a few hundred enterprise logins and hides the traffic that is presumably not being logged because of privacy and compliance.
Re: segmenting traffic sources through chatbots and agents
There is an emerging technology, Anonymous Credentials, that aims to solve this problem:
https://mailarchive.ietf.org/arch/msg/privacy-pass/--JXbGvkH...
https://blog.cloudflare.com/private-rate-limiting/
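Very roughly, the idea is that clients spend single-use tokens instead of being identified by IP or account. A drastically simplified Python sketch of the flow (real Privacy Pass-style schemes use blind signatures/VOPRFs so the issuer can't link issuance to redemption; the plain random tokens here are just to show the accounting):

```python
import secrets

class Issuer:
    """Hands out a fixed budget of single-use tokens per client per period.
    In a real anonymous-credential scheme the tokens are blinded, so the
    issuer cannot tell which client a redeemed token came from."""
    def __init__(self, tokens_per_period: int = 100):
        self.tokens_per_period = tokens_per_period
        self.valid_tokens: set[str] = set()

    def issue(self) -> list[str]:
        batch = [secrets.token_hex(16) for _ in range(self.tokens_per_period)]
        self.valid_tokens.update(batch)
        return batch

class Origin:
    """Rate-limits by consuming tokens instead of tracking IPs or accounts."""
    def __init__(self, issuer: Issuer):
        self.issuer = issuer

    def handle_request(self, token: str) -> bool:
        if token in self.issuer.valid_tokens:
            self.issuer.valid_tokens.remove(token)  # single use
            return True
        return False  # budget exhausted or forged token: throttle

issuer = Issuer(tokens_per_period=3)
origin = Origin(issuer)
wallet = issuer.issue()
# Three valid requests succeed, the forged fourth is rejected: [True, True, True, False]
print([origin.handle_request(t) for t in wallet + ["forged-token"]])
```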
> The entire industry lacks security experts.
Disagree. I think you mean "cheap experts", in which case I withdraw.
The most talented security professionals I've seen so far are from Europe. But they get paid dirt by comparison to the US.
Here in the US as well, for over a decade now there has been this cry about a "skills shortage". There are plenty of skilled people. But companies want to pay them dirt, have them show up to in-person offices, and pass drug tests. I'm sure they'll add degrees to that list soon as well. It's a game as old as time.
The reality is that infosec is flooded with entry level people right now, and many of them are talented. Pay is decreasing, even in the US. EU, EMEA, Latin America will hurt even more as a result in the long term.
Security isn't revenue-generating unless you're a security company, so companies in general want security, but they want it cheap. They want cheap tools and cheap people. That's what they mean by "skills shortage"; there isn't an actual skills shortage. They think infosec professionals should get paid a little bit more than help desk. Of course, there are many exceptions: places that are flexible and pay well (heck, even just flexible!) are being flooded with resumes from actual humans.
Infosec certification costs are soaring because of the spike in demand. Next to compsci, "cyber security" is the easy way to make a fortune (or so the rumor goes), and fresh grads looking for a good job are in for a shock.
> here we have a bunch of companies with presumably human devs, that just got pwned by an AI script kiddie. Womp womp.
What's your point? You don't need AI; companies get pwned by script kiddies buying $30 malware on Telegram all the time, despite paying millions for security tools/agents and people.
Huh, we've been offering VP-level security roles for months with a pretty good package (certainly not dirt), and all we get are junior applicants with four years or less of work experience.
So yeah, maybe we need to offer even more - but it's not far off what I make after 30+ years in the industry. Pay expectations seem very high, even for people only just out of college.
I expect salaries to escalate for cybersecurity, sales and marketing as others get pushed down.
That's not the trend; supply is steadily rising, and it's only slightly behind software dev.
Remote?
I won't ask you the salary, but for example, $100k was for experienced security professionals not too many years ago. Now it's almost laughable for entry level.
The cost of living, mortgage rates, house prices, and rent have all gone up. But not only that, COVID inflated the currency really badly. Even at normal inflation rates, $100k would be the low end for entry level by now; you can imagine what inflation has done to it now.
The title you mentioned doesn't tell much, so I can't speculate. For Fortune 1000s or well-funded startups, I wouldn't expect any less than $250k/yr at the low end for a VP-level security role. But if you're in finance, everyone is a VP of something, so it's more like a mid-level experienced person's role (closer to $200k). If you're requiring they show up to the office, add 30%; if it isn't hybrid but full-on RTO, 50%.
Also, most skilled security professionals have put lots of time and energy into their tradecraft. They wouldn't want to be a manager who just attends meetings. You're better off with someone who has strong leadership experience and knows enough infosec to discern b.s.
They are hyping up the automation as if attack scripts and mass scanning of hosts never existed before 2023.
> What would people want from them? Not disclose? Not name names?
I'd say AI fear-mongering, gatekeeping your best models, and NEVER giving back anything to the open-source community is pretty asshole behavior. Is that who Dario really is, or does the industry "push" AI company CEOs to behave like this?
What do you mean? Anthropic has released lots of open source software:
https://github.com/orgs/anthropics/repositories
https://www.anthropic.com/news/anthropic-and-the-department-...
> How about, “which parts of these attacks could ONLY be accomplished with agentic AI?” From our little perch at BIML, it looks like the answer is a resounding none.
Lost me right out of the gate. It doesn't matter if only agentic AI could have found it. Any attack could be found by somebody else; what matters is that there isn't a human sitting there hammering away for hours. You can just "point and shoot."
I don't understand how anyone could think that the step change from "requiring expensive expertise" to "motive and money to burn" is not massive in the world of security.
It would be like looking at the first fully production AI infantry and saying "yeah, well, someone else could do that."
Very much agreed! Might as well dismiss most of the field, because it doesn't do anything you couldn't accomplish with "rubber hose" cryptography.
from the "cybersecurity implications" conclusion section at the end of the anthropic report:
> This campaign demonstrates that the barriers to performing sophisticated cyberattacks have dropped substantially—and we can predict that they’ll continue to do so.
this is the point. maybe it's not some novel new thing, but if it makes it easier for greater numbers of people to actually carry out sophisticated attacks without the discipline that comes from having worked for that knowledge, then maybe it's a real problem. i think this is true of ai (when it works!) in general though.
Every time this argument is brought up, it reminds me of "cancel culture".
Argument: X is good for Z but makes it easier to commit Y, so we must ban/limit X.
What happens in reality: X is banned, and those who actually want to use it to do Y still find a way to use X. Meanwhile, society is deprived of all the Z.
In this case, though, banning X takes away a lot of the financing that makes X possible or lets it improve further. Sure, X-1 will continue to exist in perpetuity, but it will be frozen, which allows society to catch up and mitigate Y more effectively.
EDIT: nevermind the fact that being able to do Z is not at all a fair trade for getting X. But that’s just me.
In this case, a company that develops X is actively investing in understanding the Y problem and sharing its findings with the general public, toward the development of an X that doesn't have a Y problem?
Anthropic fairly consistently advocates for the same broad approach to such problems: have the government tightly regulate AI. It is, of course, a pure coincidence that this is exactly the approach that would kill off any open competition and consolidate the market around a few established companies that can afford to deal with such regulatory frameworks, Anthropic being one of them.
Article doesn't say much. Nor does the Anthropic article.
AI as a power tool for attackers does provide additional attack power. Even if it can't do anything new, it can do a lot of existing stuff and follow up on what's working. Which is often enough to get the job done. Plus, like all programs, it's fast, patient, and can be parallelized. "Agentic AI" provided with a set of hacking tools it can run is somewhat scary.
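For anyone who hasn't built one: the scary part is how small the loop is. A hedged sketch of a generic tool-calling agent in Python, with `call_llm` as a stand-in for whatever model API you'd actually use and a deliberately tiny read-only allowlist (assumes `dig` is on PATH):

```python
import subprocess

ALLOWED = {"dig", "whois", "nmap"}  # tiny illustrative allowlist

def run_tool(command: list[str]) -> str:
    """Execute one allowlisted command and hand its output back to the model."""
    if not command or command[0] not in ALLOWED:
        return "tool refused"
    out = subprocess.run(command, capture_output=True, text=True, timeout=120)
    return out.stdout[-4000:]  # truncate so the context stays small

def call_llm(transcript: list[dict]) -> dict:
    """Stand-in for a real model call. A real agent asks the LLM which tool
    to run next; this fake one requests a single DNS lookup, then stops."""
    if len(transcript) == 1:
        return {"tool": ["dig", "+short", "example.com"]}
    return {"answer": "done: " + transcript[-1]["content"].strip()}

def agent(goal: str, max_steps: int = 20) -> str:
    transcript = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        reply = call_llm(transcript)
        if "answer" in reply:
            return reply["answer"]
        transcript.append({"role": "tool", "content": run_tool(reply["tool"])})
    return "step budget exhausted"

print(agent("enumerate example.com"))
```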
This isn't what I got from the Anthropic article.
It's the fact that somebody like NSO Group could create a Pegasus much faster, without humans in the loop.
Write down everything your most brilliant hacker minds know - put it in some documents, feed it to the AI with the workflow you usually use, and let 50 AI agents do them all at the same time.
Put in those documents how you came up with those exploits, how you tested them, and how you explored the possibilities. Tell the AI to do that too.
We're talking about speed and scale, not some dismissive "script kiddies were doing this in the 90s".
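In code terms, fanning one written-down playbook out to 50 concurrent agents is a handful of lines. A sketch with placeholders (`run_agent` stands in for whatever agent harness you'd actually drive):

```python
import asyncio

PLAYBOOK = "contents of the document your best people wrote"  # placeholder
TARGETS = [f"10.0.{i}.0/24" for i in range(50)]                # placeholder scopes

async def run_agent(playbook: str, target: str) -> str:
    """Stand-in for one agent session; a real harness would loop an LLM over tools here."""
    await asyncio.sleep(0.01)  # simulate work
    return f"{target}: finished"

async def main() -> None:
    sem = asyncio.Semaphore(50)  # all 50 in flight at once

    async def bounded(target: str) -> str:
        async with sem:
            return await run_agent(PLAYBOOK, target)

    results = await asyncio.gather(*(bounded(t) for t in TARGETS))
    print(f"{len(results)} agents completed")

asyncio.run(main())
```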
Yes, we were. But not at this scale - and this is just the early days of this sort of thing.
This. They executed an attack on 30 companies at once! AI increases shots on goal here and rapidly increases tempo. There's no reason this can't add a zero-day and target 50,000 companies at once, except for network observers like CF.
what i got from it is "these are the apps that HN has been looking for"
AI hype is real, but we ought to start also examining anti-AI-hype-hype. It's become fashionable to rage against AI as a whole with about the same amount of understanding that the MBA hype edgelords have when they push AI as a cure-all, and both are a bad look.
To balm the enraged; look, I agree with you, the hype is indeed out of control. But like, let the vultures spend all their monies. Eventually the bubble itself will go the way of NFTs and we'll all be able to buy GPUs and SSDs again. Hopefully.
That said, there's an important chunk of discourse that gets shouted down and it really shouldn't. For just a moment, table the issues that come out of "AI as an Everything Replacement" and think of the new things that come out of this tech. On-demand tutors that never tire. Actually viable replacement for search. Large heterogeneous datasets can now be rapidly parsed, by an individual, for specific insights. Personal dev teams at a fraction of the cost that now make it possible for people with absolutely bugfuck ideas to actually try them without worrying about wasted time or resources - we are going to see a vibrancy in the world that was not there before.
It is not an unqualified or unmitigated good. Hell, I'll even grant that it may be a net negative - but I don't know either way, and I don't think anyone else does either. Not with any significant confidence. It just feels like we've skipped the part of the discussion where discourse occurs and gone right to "Pro" and "Anti" camps with knives at throats and flushed, sweaty faces.
Two factors here.
1) "Tech bro" AI hype in keynotes and online forums is annoying. It usually contains a degree of embellishment and drama; kinda feels like reality TV but for software developers. Instead of Hollywood socialites, we get Sam Altman and the gang. Honestly, this annoys me but I ignore it beyond key announcements.
2) This hype cycle, unlike NFTs, is putting our economy in serious danger. This is repeated ad nauseam on YouTube. While there is some hype on the topic here too, the implications are serious and real. I won't go into details, but I restructured my portfolio to harden it against an AI collapse. I didn't want to do that, but I did. I want to retire someday.
Considering point 2, I'd guess some of the "hype" is more frustration, since I can't be the only person in that position.
Yeah, I see both those points, and really I agree with both. Actually, I think problem 1 is exacerbating problem 2 by a lot - I get just as mad at the postmillennial dudebro with the get-rich-quick-on-AI scam video as I do at the AI-MBAs of the world.
Actually, that's a lie. The MBAs are still worse. They ought to know better at least.
All I'm getting at is that while we put totally legitimate backpressure on the hype cycle, we should at the same time be able to talk about and develop those elements of this new tech that will benefit us. Not "us the tech VCs" (I am not one of them) but "us the engineers and creatives".
Yes it's disruptive. Yes it's already caused significant damage to our world, and in a lot of ways. I'm not at all trying to downplay that. But we have two ways this goes:
- people (individuals) manage to adopt and leverage this tech to their own benefit and the benefit of the commons. Large AI companies develop their models and capture large sectors of industry, but the diffusion of the disruption means that individuals also have been empowered, in many ways that we can't even predict yet.
- people (individuals) fight tooth and nail against this tech, and lose the battle to create laws that will contain it (because let's be honest, our leadership was captured by private interests long ago and OpenAI / MSFT / Google / Meta have deep enough pockets to afford to buy the legislature). Large AI companies still develop their models and capture whole sectors of industry, but this time they go unchecked due to a fragile and damaged AI industry in the commons. We learn too late that the window to make use of this stuff has closed because all the powerful stuff is gated behind corporate doors and there ARE laws about AI now but basically those laws make it impossible to challenge the entrenched powers (kinda like they do now with pre-AI tech - patent laws and legal challenges to threats to power - like what the EFF is constantly battling).
If we do not begin to steer towards a robust open conversation about creating and using these models, it's only going to empower the people that we are worried about empowering already. Yes, we need to check the spread of "AI in fucking everything". Yes we need to do something about scraping all data everywhere all the time for free. But if we don't adopt the new weapon in the information space, we'll just be left with digital muskets versus armies of indefatigable robots with heat-seeking satellite munitions. Metaphorically(?) speaking.
LLMs are already pretty good at brute force security testing. They aren’t “polite” pen testers.
I recently used an LLM to win a CTF at work (there were no rules against AI, but I bet there will be next year). I felt a little bad at the end, when they demoed the intended hacks and, for a couple of them, it was the first time I had seen the home page. If it could quickly hack it with just the clue and URL, I just let it.
For any serious website it needs a lot more direction, but it will help you along nicely.
I only saw refusals twice over an entire week, and I used three different major LLM agents (Codex CLI, Claude Code CLI, and Gemini CLI).
It took time; I spent something like 20 hours guiding it. But if you have the time and some expertise, the tools are extremely workable.
What the heck is "cyber cyber"? They say that twice...
Whoa there. Asking for evidence from an AI company? That's an unreasonable standard! /s
It's almost like asking for accurate revenue figures.