This week, the White House announced that it had secured “voluntary commitments” from seven leading AI companies to manage the risks posed by artificial intelligence.

Getting the companies – Amazon, Anthropic, Google, Inflection, Meta, Microsoft and OpenAI – to agree on anything is a step forward. They include bitter rivals with subtle but important differences in the ways they approach AI research and development.

Meta, for example, is so eager to get its AI models into the hands of developers that it has open-sourced many of them, putting their code out for anyone to use. Other labs, like Anthropic, have taken a more cautious approach, releasing their technology in more limited ways.

But what do these commitments actually mean? And are they likely to change much about how AI companies operate, given that they are not backed by the force of law?

Given the potential stakes of AI regulation, the details matter. So let’s take a closer look at what is being agreed upon here and expand on the potential impact.

Commitment 1: The companies commit to internal and external security testing of their AI systems before their release.

Each of these AI companies already conducts security testing – what is often called “red teaming” – of their models before they are released. On one level, this is not really a new commitment. And it’s a vague promise: it doesn’t come with many details about what kind of testing is required, or who will do the testing.

In a statement accompanying the commitments, the White House said only that testing of AI models “will be done in part by independent experts” and will focus on AI risks “such as biosecurity and cybersecurity, as well as its broader societal impacts.”

It’s a good idea to get AI companies to publicly commit to continuing to do this kind of testing, and to encourage more transparency in the testing process. And there are some types of AI risk — such as the danger that AI models could be used to develop bioweapons — that government and military officials are probably better suited than companies to assess.

I would like to see the AI industry agree on a standard battery of safety tests, such as the “autonomous replication” tests that the Alignment Research Center conducts on pre-release models from OpenAI and Anthropic. I would also like to see the federal government fund these kinds of tests, which can be expensive and require engineers with significant technical expertise. Right now, many safety tests are funded and overseen by the companies themselves, which raises obvious conflict-of-interest questions.

Commitment 2: The companies commit to sharing information across the industry and with governments, civil society and academia on managing AI risks.

This commitment is also somewhat vague. Several of these companies already publish information about their AI models – usually in academic papers or company blog posts. Some of them, including OpenAI and Anthropic, also publish documents called “system cards” that outline the steps they’ve taken to make those models more secure.

But they have also withheld information at times, citing security concerns. When OpenAI released its latest AI model, GPT-4, this year, it broke with industry custom and chose not to disclose how much data it was trained on, or how large the model was (a metric known as “parameters”). It said it declined to release this information because of competition and security concerns. It also happens to be the kind of data that tech companies like to keep away from competitors.

Under these new commitments, will AI companies be compelled to publish such information? What if doing so risks accelerating the AI arms race?

I suspect the White House’s goal is less about forcing companies to disclose their parameter numbers and more about encouraging them to trade information with each other about the risks their models do (or don’t) pose.

But even such information sharing can be risky. If Google’s AI team prevented a new model from being used to engineer a deadly bioweapon during prelaunch testing, should it share that information outside of Google? Would that risk giving bad actors ideas about how they could get a less guarded model to perform the same task?

Commitment 3: The companies commit to investing in cyber security and insider threat safeguards to protect proprietary and unpublished model weights.

This one is pretty simple, and undisputed among the AI insiders I’ve talked to. “Model weights” is a technical term for the mathematical instructions that give AI models the ability to function. Weights are what you would want to steal if you were an agent of a foreign government (or a rival corporation) who wanted to build your own version of ChatGPT or another AI product. And it’s something that AI companies have a vested interest in keeping tightly controlled.

There have already been well-publicized problems with model weights leaking. The weights for Meta’s original LLaMA language model, for example, were leaked on 4chan and other sites just days after the model was publicly released. Given the risks of more leaks — and the interest other nations may have in stealing this technology from American companies — asking AI companies to invest more in their own security feels like a no-brainer.

Commitment 4: The companies commit to facilitating third-party discovery and reporting of vulnerabilities in their AI systems.

I’m not really sure what that means. Every AI company has discovered vulnerabilities in its models after releasing them, usually because users try to do bad things with the models or bypass their guardrails (a practice known as “jailbreaking”) in ways the companies didn’t anticipate.

The White House commitment calls for companies to set up a “robust reporting mechanism” for these vulnerabilities, but it’s unclear what that might mean. An in-app feedback button, similar to the ones that allow Facebook and Twitter users to report posts that violate rules? A bug bounty program, like the one OpenAI started this year to reward users who find flaws in its systems? Something else? We will have to wait for more details.

Commitment 5: The companies commit to developing robust technical mechanisms to ensure that users know when content is AI-generated, such as a watermarking system.

This is an interesting idea but leaves a lot of room for interpretation. Until now, AI companies have struggled to come up with tools that allow people to tell whether or not they are viewing AI-generated content. There are good technical reasons for this, but it’s a real problem when people can pass off AI-generated work as their own. (Ask any high school teacher.) And many of the tools currently promoted as being able to detect AI output really cannot do so with any degree of accuracy.

I am not optimistic that this problem is fully fixable. But I’m glad that companies promise to work on it.

Commitment 6: The companies commit to publicly reporting the capabilities, limitations and areas of appropriate and inappropriate use of their AI systems.

Another sensible promise with a lot of wiggle room. How often will companies have to report on the capabilities and limitations of their systems? How detailed will that information need to be? And given that many of the companies building AI systems have been surprised by the capabilities of their own systems after the fact, how well can you really expect them to describe them in advance?

Commitment 7: The companies commit to prioritizing research into the social risks that AI systems may pose, including avoiding harmful bias and discrimination and protecting privacy.

Committing to “prioritizing research” is about as vague as a commitment gets. However, I’m sure this commitment will be well received by many in the AI ethics crowd, who want AI companies to make preventing near-term harms like bias and discrimination a priority over worrying about doomsday scenarios, as the AI safety folks do.

If you’re confused by the difference between “AI ethics” and “AI safety,” just know that there are two warring factions within the AI research community, each of which believes the other is focused on preventing the wrong kinds of harm.

Commitment 8: The companies commit to developing and deploying advanced AI systems to help address society’s greatest challenges.

I don’t think many people would argue that advanced AI should not be used to help address society’s greatest challenges. The White House lists “cancer prevention” and “mitigating climate change” as two of the areas where it would like AI companies to focus their efforts, and it will get no disagreement from me there.

What makes this goal somewhat complicated, however, is that in AI research, what starts out looking frivolous often turns out to have more serious implications. Some of the technology that went into DeepMind’s AlphaGo — an AI system that was trained to play the board game Go — proved useful in predicting the three-dimensional structures of proteins, an important discovery that accelerated basic science research.

Overall, the White House’s deal with AI companies seems more symbolic than substantive. There is no enforcement mechanism to ensure that companies follow through on these commitments, and many of them reflect precautions that AI companies are already taking.

However, it is a reasonable first step. And agreeing to follow these rules shows that the AI companies have learned from the failures of earlier technology companies that waited to engage with the government until they had problems. In Washington, at least when it comes to tech regulation, it pays to show up early.
