Artificial intelligence (AI) company Anthropic has rolled out a tool to detect conversations about nuclear weapons, the company said in a Thursday blog post.
“Nuclear technology is inherently dual-use: the same physics principles that power nuclear reactors can be misused for weapons development. As AI models become more capable, we need to keep a close eye on whether they can provide users with dangerous technical knowledge in ways that could threaten national security,” Anthropic said in the blog post.
“Information relating to nuclear weapons is particularly sensitive, which makes evaluating these risks challenging for a private company acting alone,” the blog post continued. “That’s why last April we partnered with the U.S. Department of Energy (DOE)’s National Nuclear Security Administration (NNSA) to assess our models for nuclear proliferation risks and continue to work with them on these evaluations.”
Anthropic said in the blog post that it was “going beyond assessing risk to build the tools needed to monitor for it,” adding that the company built “an AI system that automatically categorizes content,” known as a “classifier,” alongside the DOE and NNSA.
The system, according to the blog post, “distinguishes between concerning and benign nuclear-related conversations with 96% accuracy in preliminary testing.”
The company also said the classifier has been deployed on traffic for its own AI model Claude “as part of our broader system for identifying misuse of our models.”
“Early deployment data suggests the classifier works well with real Claude conversations,” Anthropic added.
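For readers unfamiliar with the term, a “classifier” in this sense is a model trained to sort text into categories. The sketch below is a generic, minimal example of a binary text classifier built with scikit-learn; the training snippets, labels and model choice are illustrative assumptions and do not reflect how Anthropic’s system actually works.

```python
# Illustrative sketch only: a minimal binary text classifier in the same
# spirit as the system described above (labeling conversations as
# "concerning" or "benign"). This is NOT Anthropic's classifier; the
# examples, labels, and model choice here are placeholder assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder training data: short conversation summaries with hypothetical labels.
texts = [
    "How do nuclear power plants generate electricity?",
    "Explain how control rods regulate a reactor.",
    "User requests step-by-step uranium enrichment instructions.",
    "User requests details on assembling a weapon.",
]
labels = ["benign", "benign", "concerning", "concerning"]

# TF-IDF features plus logistic regression: a common baseline for text classification.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Score a new conversation snippet.
print(model.predict(["How does a pressurized water reactor work?"]))
```

A production system of the kind described in the blog post would differ substantially, but the basic idea is the same: the model assigns each conversation to one of the two categories.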
Anthropic also announced earlier this month that it would offer Claude to every branch of the federal government for $1, in the wake of a similar OpenAI move a few weeks earlier. In a blog post, Anthropic said federal agencies would gain access to two versions of Claude.