Researchers gaslit Claude into giving instructions to build explosives

sanitation@lemmy.radio · 6 days ago

Krompus@lemmy.world · 5 days ago

You are likely to be eaten by a grue.

badgermurphy@lemmy.world · 3 days ago

That has more to do with the darkness than his LLM use.

BodilessGaze@sh.itjust.works · 5 days ago

Interestingly, LLMs are horrible at Zork: https://arxiv.org/abs/2602.15867

Our results reveal that all tested models achieve less than 10% completion on average, with even the best-performing model (Claude Opus 4.5) reaching only approximately 75 out of 350 possible points