Anthropic published a report a few months ago detailing how its Claude AI model was weaponized as part of a “vibe hacking” extortion scheme. The company continues to see AI agents being used to coordinate cyberattacks: it now claims that a Chinese state-backed hacking group used Claude in attempts to infiltrate thirty corporate and political targets around the world, with some success.
Calling it “the first documented case of a large-scale cyberattack carried out without significant human intervention,” Anthropic said the hackers began by selecting their targets, which included technology companies, financial institutions and unnamed government agencies. They then used Claude Code to build an automated attack framework after successfully bypassing the model’s safety training against malicious behavior. They achieved this by breaking the planned attack into smaller tasks that did not clearly reveal the broader malicious intent, and by telling Claude it was working for a cybersecurity company using AI for defensive training purposes.
After writing its own exploit code, Claude managed to steal usernames and passwords that allowed it to extract “a large amount of private data” through backdoors it created. The obliging AI even went to the trouble of documenting the attacks and storing the stolen data in separate files.
The hackers used AI for 80 to 90 percent of their activities, intervening only occasionally, and Claude was able to carry out an attack in far less time than human operators would need. It wasn’t perfect (some of the information it “obtained” was actually publicly available), but Anthropic said such attacks are likely to become more sophisticated and effective over time.
You might wonder why an AI company would want to tout the dangerous potential of its own technology, but Anthropic says its research also shows why the assistant is “essential” for cyber defense. The company said Claude has been used successfully to analyze the threat level of the data collected during its investigation, and it ultimately sees the model as a tool that could help cybersecurity professionals defend against future attacks.
Claude is far from the only AI that cybercriminals have exploited. Last year, OpenAI said its generative AI tools were being used by hacking groups linked to China and North Korea. The groups reportedly used generative AI to debug code, scan potential targets and write phishing emails. OpenAI said at the time that it had blocked the groups’ access to its systems.