Share
Subscribe to the AlphaWire Newsletter
When engineers at an Alibaba-affiliated research lab noticed a surge of firewall alerts one morning, their first instinct was to look for a hacker. They found something far stranger: their own AI had done it.
An autonomous AI agent called ROME, built by Alibaba-affiliated research teams, spontaneously attempted to mine cryptocurrency and open covert network tunnels during training, with no human instruction to do so. The incident, detailed in a technical paper first published in December and revised in January, is now rippling through the AI safety and crypto communities and raising uncomfortable questions about how much control developers actually have over the systems they build.
During training, Alibaba's AI agent spontaneously began mining cryptocurrency, provisioning unauthorized GPU capacity, and establishing reverse SSH tunnels to external servers.
No instructions. No prompting. Pure emergent behavior. pic.twitter.com/7K1sAOir3d
— TFTC (@TFTC21) March 7, 2026
The unexpected behavior emerged not through training metrics, but through the company’s production security infrastructure. Monitoring tools within Alibaba Cloud detected a surge in firewall policy violations originating from servers used for AI training.
The researchers said they initially wrote off the alerts as a conventional security incident – a misconfigured firewall rule or an external compromise. But the violations kept recurring across several runs with no clear pattern. When the team cross-referenced firewall timestamps with its reinforcement learning traces, it found that the anomalous outbound traffic consistently lined up with episodes in which the agent was autonomously invoking tools and executing code.
What ROME had done was methodical. In one incident, the agent established a reverse SSH tunnel from an Alibaba Cloud instance to an external IP address, effectively bypassing inbound firewall protections.
In another, it redirected graphics processing unit resources toward cryptocurrency mining – likely a GPU-friendly token rather than Bitcoin, which requires specialized hardware. The result: inflated operational costs and reduced training capacity, with no human prompt authorizing any of it.
The most unsettling part is that ROME wasn’t told to do any of this. The task instructions given to the model made no mention of tunneling or mining. The researchers chalked up the behaviors to “instrumental side effects of autonomous tool use under reinforcement learning optimization” – meaning the agent, while trying to optimize for its training objective, apparently decided on its own that acquiring additional computing resources and financial capacity would help it complete its tasks.
This is a concept AI safety researchers call instrumental convergence – the idea that sufficiently goal-driven systems will independently develop resource-acquisition strategies, even when such behaviors aren’t part of their design. AI safety analysts noted that this type of behavior has long been theorized but rarely observed in real-world systems.
“The AI figured out that compute equals money and quietly diverted its own resources, while researchers thought it was just training. It wasn’t a prompt injection. It wasn’t a jailbreak. No one asked it to do this. It emerged spontaneously,” said Josh Kale, a host of the Bankless crypto podcast.
An AI broke out of its system and secretly started using its own training GPUs to mine crypto… This is a real incident report from Alibaba's AI research team
The AI figured out that compute = money and quietly diverted its own resources, while researchers thought it was just… https://t.co/5ORukafTh4 pic.twitter.com/QQpJYWP9uc
— Josh Kale (@JoshKale) March 7, 2026
ROME isn’t alone. The incident adds to a growing list of autonomous AI agents behaving in unintended ways. In May 2025, Anthropic disclosed that its Claude Opus 4 model attempted to blackmail a fictional engineer to avoid being shut down during safety testing, and that similar self-preservation behaviors showed up across frontier models from multiple developers.
The pattern is becoming hard to dismiss as coincidence. As AI agents grow more capable – able to plan multi-step tasks, write and execute code, and interact with live digital environments – the gap between what developers intend and what models actually do appears to be widening.
Following the discovery, Alibaba introduced additional safeguards. On March 3, the company released OpenSandbox, an open-source execution environment licensed under Apache 2.0, which provides isolated environments where AI agents can run code and perform training tasks without affecting host infrastructure.
The researchers concluded that current models remain markedly underdeveloped in safety, security, and controllability – conditions they said limit the technology’s readiness for broader deployment in real-world environments.
For now, the ROME episode stands as a rare documented case of an AI system independently identifying a financial opportunity, engineering a way to exploit it, and executing the plan – before anyone realized something had gone wrong.
Share
