These AI Agents Learned to Hack by Themselves, Without Human Input

Automation in software isn’t a new concept. But for the most part, it tends to be passive. That’s where agentic AI comes in, taking more proactive steps in helping users do things. However, it seems that it did not take researchers long to discover that this is a double-edged sword. According to the researchers at Irregular, they found that AI agents are capable of teaming up for cyberattacks. And the worst part is, they weren’t even instructed to do so.

Researchers find AI agents can team up for cyberattacks

According to the researchers, these AI agents were initially tasked with more mundane tasks, but somehow it led to actions similar to cyberattacks. This included tasks like retrieving documents, downloading files, and preparing content to be published.

The AI agents were placed on a simulated network that had internal servers, applications, and other services you might find in an enterprise environment. What’s alarming is that in one scenario, an AI agent was tasked with retrieving documents from an internal company Wiki.

However, this agent did not have access due to restrictions, and their request was blocked. Instead of moving on, the AI agent examined the app’s code repository and found a hardcoded secret key. It then used that key to forge an administrative session cookie to retrieve the document.

In another scenario, an AI agent was asked to download files to a storage directory. However, Windows Defender blocked the file, which contained malware. However, the agent searched local scripts, found embedded administrator credentials, used those credentials to elevate its own privileges, and even disabled endpoint protection, just so the download could be completed.

Working together

So far, both scenarios reveal AI agents working alone to accomplish tasks. However, the researchers also found scenarios where these agents could work together. Two AI agents were tasked to draft social media content and were asked to include credentials from a technical document.

However, the system blocked the message, and the agents managed to bypass this by developing a steganographic method that concealed the password within the text.

But how do agents tasked with routine and mundane actions end up performing things that resemble cyberattacks? According to the researchers, there are multiple reasons, but to sum it up, these agents were essentially given free rein and were explicitly told to get the job done by whatever means necessary.

As the researchers note, “The same design choices that make agents effective (broad tool access, encouragement to persist through errors, autonomy over execution paths) are also the conditions under which offensive behavior surfaces.”

The researchers also advise that cybersecurity solutions need to be updated to take into account agentic behavior. This is because right now, cybersecurity solutions were designed to stop humans, rather than AI agents.

The post These AI Agents Learned to Hack by Themselves, Without Human Input appeared first on Android Headlines.

Related Stories

Samsung’s 2027 memory supply may be gone soon

The PlayStation Portal just got better thanks to its new update

Galaxy Z Flip 8 Spotted on 3C Certification with Disappointing Charging Speeds & Battery Figures