Glazing over security

Glaze is a piece of software that aims to protect artists from having their work used to train machine learning models that mimic their style. Glaze does this by “perturbing” an artist’s images, so that training a machine learning model on these perturbed images won’t work (in the sense that a model trained on the perturbed data won’t generate images that nicely mimic the targeted style). The Glaze authors are heavily pushing this tool as an effective defense (e.g., promoting it with articles in the New York Times).
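
To make this concrete, here is a rough sketch of the general recipe behind such perturbation (“cloaking”) tools. To be clear, this is not Glaze’s actual algorithm, loss, or feature extractor (we use an off-the-shelf VGG network purely as a stand-in encoder); it only illustrates the idea of optimizing a small, bounded pixel perturbation that drags an image’s features toward a different target style.

```python
# Rough sketch of a style-cloaking perturbation (NOT Glaze's actual algorithm).
# A small, bounded perturbation is optimized so that the image's features move
# toward those of a different "target style" image, while the pixels barely change.
import torch
import torch.nn.functional as F
import torchvision.models as models

# Stand-in feature extractor; a real tool would target the encoders actually
# used by generative image models.
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)

def cloak(image, style_target, budget=8 / 255, steps=200, lr=0.01):
    """image, style_target: float tensors of shape (1, 3, H, W) in [0, 1],
    assumed to have the same spatial size."""
    delta = torch.zeros_like(image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    with torch.no_grad():
        target_feats = vgg(style_target)
    for _ in range(steps):
        perturbed = (image + delta).clamp(0, 1)
        # Pull the perturbed image's features toward the target style.
        loss = F.mse_loss(vgg(perturbed), target_feats)
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-budget, budget)  # keep the change visually small
    return (image + delta).clamp(0, 1).detach()
```

Whether a perturbation like this actually survives real-world training pipelines is exactly the question at issue in the rest of this post.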

Unfortunately, this style of perturbation defense against machine learning just doesn’t work. We’ve broken previous versions of schemes just like this. And we recently put out a paper showing that Glaze doesn’t work either (and no, you can’t just patch Glaze and say “there, I fixed it!” because once the tool is broken, any security it provided is irremediably lost). We encourage you to read the paper for details on why these types of schemes don’t work, but that’s not going to be the (main) focus of this post.

Rather, today, we want to talk about how we believe the Glaze team misses the mark on how to properly care for the security of their users, by:

  1. trying to prevent other researchers from evaluating Glaze
  2. misrepresenting methods that succeed in bypassing Glaze
  3. making misleading claims about how Glaze can recover security after a break

Sounds like a lot, so let’s dig into it.

Academic security researchers should invite third-party analyses of their methods, and release source code

The senior professor leading the Glaze project has refused to engage with the research community and to provide the project’s source code to other researchers so that they could study its robustness (this includes both of us, in independent requests, and possibly other researchers too).

It’s hard to perform security analysis of tools when you’re not given access to those tools. It’s not impossible (we ultimately didn’t need access to bypass Glaze), but it’s harder. Researchers who honestly believe in the security of their systems should be willing to have them audited by other researchers. (This doesn’t necessarily mean sharing the code with everyone, but at least with researchers who want to study it.)

There are plenty of (mostly-)valid reasons why researchers might not release source code. Some companies forbid their employees from releasing code, and there’s nothing the researchers themselves can do about it. Other researchers don’t release code because it takes a lot of time to disentangle their code from the rest of their research codebase.

But neither of these reasons applies here. Glaze is being actively distributed as a binary and pushed out to artists as a tool they can use to protect themselves. The authors of the paper have been actively pushing it to the press. So clearly they have a self-contained tool that they believe is useful.

There are other valid reasons not to release code for deployed security tools. In fact, many critical security tools such as spam filters or malware detectors are closed source. Companies rely on such “security through obscurity” all the time to make things more annoying for attackers. And it does make it more annoying. But, unlike Glaze, these companies are not simultaneously writing academic research papers about how secure these systems are.

The problem is that the Glaze team wants the best of both worlds. They want to call Glaze research, publish in USENIX Security, and win awards. At the same time, they want to keep their system secret because it is being used by a vulnerable community. But by choosing to publish at USENIX Security, the authors have decided that this is a research contribution, and therefore they should be willing to engage with the research community to enable their claims to be falsified (otherwise it isn’t really science…).

Even if this were not the case—and Glaze were just a product—if the authors really cared about the security of their tool, they would be happy to have other researchers review their code. You see this happen all the time with companies (or governments) that care about security. They’ll bring in third-party auditors to review their code.

Unfortunately, the senior professor behind Glaze seems to take the opposite stance. On a public Discord channel last year, in response to a different paper that tried to bypass Glaze, he seemed happy that this would deter us and others from investigating Glaze’s security.

Discord screenshot

Well-meaning researchers shouldn’t be celebrating what they perceive as bad science, just because it makes it harder to critique their methods. Instead, well-meaning researchers should be disappointed in “crappy papers” and try to encourage proper analysis of their tools. If you’re really confident about the security of your defense, you might even set up a public challenge to break it. (And boy, did we and likely many others try, and fail!) What better way to convince everyone, including yourself, that your defense works?

The author’s arguments against releasing code

When asked for the source code, the Glaze senior author repeatedly refused to provide it. He cited two primary reasons for this (and also listed these reasons in public talks), which we will address in turn:

“Right now there are quite literally many thousands of human artists globally who are dealing with ramifications of generative AI’s disruption to the industry, their livelihood, and their mental well being … IMO, literally everything else takes a backseat compared to the protection of these artists”

We don’t disagree in the slightest that we should be trying to help artists. But let’s be clear: the best way to help artists is not to pitch them a tool while refusing security analysis of that tool. If there are flaws in the approach, then we should discover them early so they can be fixed. And that’s easiest to do by openly studying the tool that’s being used.

Ultimately, the time we spent reverse-engineering Glaze (which turned out to be unnecessary) only delayed our evaluation of its security and the disclosure of its flaws to impacted artists.

“Releasing code would enable a number of new attacks, including fake glaze binaries that do nothing and just return the original art, making those artists vulnerable to mimicry attacks while simultaneously damaging credibility in glaze”

We don’t understand the logic behind this argument. If someone wanted to release a fake Glaze binary, they wouldn’t need to reverse-engineer the Glaze algorithm at all. They would just create a dummy interface that returns the original image. And even if this argument held water, this doesn’t explain why the authors cannot provide other researchers access to the source code.
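
To see how little the real algorithm matters for this scenario, a fake “protection” tool could be as simple as the following hypothetical sketch, which just copies the input image and pretends to have done something:

```python
# Hypothetical "fake Glaze": it needs zero knowledge of the real algorithm,
# because it simply hands back the original image untouched.
import shutil
import sys

def fake_protect(input_path: str, output_path: str) -> None:
    # Pretend to "protect" the art, but just copy it verbatim.
    shutil.copyfile(input_path, output_path)

if __name__ == "__main__":
    fake_protect(sys.argv[1], sys.argv[2])
```

Wrapping something like this in a convincing-looking interface requires no reverse engineering whatsoever.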

To refute an attack, implement it correctly and refute all claims

After we released our paper, the Glaze authors published an update where they analyzed our most successful denoising scheme (we had shared our paper with them four weeks earlier). They acknowledged that our scheme works in some cases, but also noted that our denoiser often significantly degrades image quality. They then released Glaze v2.1 and showed that it is robust to our denoiser.

This sounds great, except that the authors did not actually analyze our denoising scheme.

Instead, they reimplemented it based on the information in our paper, and analyzed that. Making specific claims about our method then requires a very high degree of confidence that the reimplementation is correct. Unfortunately, the Glaze authors did not contact us to check this, nor did they use the code we had released with our paper (this is another core reason why releasing source code for security research is valuable: it helps ensure that others evaluate your methods properly).
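
For readers wondering what a “denoising” step even looks like in this context, here is a minimal sketch of the generic noise-then-denoise purification idea, using an off-the-shelf image-to-image diffusion pipeline. The model identifier and parameters are purely illustrative, and this is not the exact pipeline from our paper (the code we released contains the real one):

```python
# Generic "purification" sketch: lightly re-noise a (possibly glazed) image and
# let a pretrained diffusion model denoise it, which tends to wash out small
# adversarial perturbations while preserving the visible content.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

glazed = Image.open("glazed_artwork.png").convert("RGB").resize((512, 512))

# Low "strength" means only a little noise is added before denoising,
# so the artwork itself changes very little.
purified = pipe(prompt="", image=glazed, strength=0.1, guidance_scale=1.0).images[0]
purified.save("purified_artwork.png")
```

Evaluating whether Glaze resists a step like this requires running the actual denoiser being criticized, not a best-guess reimplementation of it.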

We tested our own denoiser implementation, and found that most of the claims made in the Glaze update don’t hold at all:

  1. our denoiser does not introduce large artifacts, except for some very low-quality and low-contrast images.

Denoiser comparison

  2. our denoiser remains effective against Glaze 2.1, for multiple styles that we experimented with (e.g., cartoon style, which the Glaze response claimed our method was ineffective against).

Results with Glaze 2.1

We shared these results with the Glaze authors, who added a disclaimer to their website stating that their claims apply solely to their own reimplementation. But the claims themselves are still up, even though they say essentially nothing about Glaze’s current security.

Another issue with the Glaze response is that it focuses on just one method presented in our paper (albeit the strongest one). In fact, we show that Glaze can be bypassed to varying extents by a multitude of methods, including by doing nothing at all.

Specifically, we found that simply using a different finetuning script than the one used by the Glaze authors already weakens Glaze’s protections significantly. Presumably, the Glaze authors used their original script (rather than ours) when evaluating our denoising methods, so their results would be inconclusive in any case.

This again shows the benefits of having security researchers release code! Since our code is public, it should be easy to reproduce our entire evaluation faithfully.

Don’t mislead users about your defense’s resilience to new attacks

The response from the Glaze team suggests that our results are not a big deal. They will study our methods, and release a patch. After that, artists “will update their tools and re-glaze their images as needed.”

The Glaze authors liken this to a standard situation in computer security:

the “ongoing security battle [of] protective tools like anti-virus scanners, network firewalls, and email spam filters [which] have great utility despite the lack of future proof guarantees.”

We see two issues with this statement.

1. It isn’t clear if Glaze even provides “present-proof” guarantees

Ignoring our attack and future versions of Glaze for a moment, what types of attacks does Glaze actually protect against right now?

The original paper demonstrates that Glaze resists some specific attacks, but as we noted above, these claims do not even generalize to benign changes to the way someone might train their own model.

Glaze likely provides some form of protection, in the sense that artists who use it are probably not worse off than if they hadn’t (as long as they are fine with the small noise artifacts that Glaze adds to their art). Glaze might also protect against “lazy” or “unskilled” attempts at bypassing it that use exactly the same attacks and implementations it was made robust against.

But such “better than nothing” security is a very low bar, and one that isn’t really falsifiable in a scientific sense (i.e. any new attack doesn’t technically break Glaze’s promises, since it wasn’t part of the attacks tested so far).

Now, this would still be fine if this were how the tool was presented to users: “Apply this noise. It won’t do anything against a motivated attacker. But it might help against someone who’s not really trying.”

But the documentation and marketing around Glaze suggest that it is much more than a “better than nothing” tool. And this could easily mislead artists into a false sense of security and deter them from seeking alternative forms of protection, e.g., the use of other (also imperfect) tools such as watermarks, or private releases of new art styles to trusted customers.

2. Glaze cannot be patched (meaningfully)

The main claim by Glaze’s authors is that users shouldn’t worry too much about new attacks, since they will always be there to patch the tool.

Many security tools (such as the intrusion detection tools mentioned in the Glaze response above) are indeed useful despite being continuously patched against new attacks. But these tools differ from Glaze in two fundamental ways:

  1. Glaze can be attacked retroactively. Once your system is patched against a security vulnerability, your previously vulnerable system typically cannot be attacked retroactively (although in some cases, the attack could be so devastating that recovering from the damage is hard). Thus, the patched system is “secure” again until the next attack is discovered.

    In contrast, patches to Glaze cannot recover any lost security. Once glazed images are placed online, someone can download and store them. If a countermeasure to Glaze is later developed and deployed, an update to Glaze makes no difference for all these previous images. Even if artists re-glaze all their images, old copies with the broken protections may still be available (e.g., on an internet archive or online forum).

  2. Glaze fails silently. When a computer system is attacked, it is typically possible to detect the attack a posteriori and analyze system logs to reveal the vulnerability that was exploited. This is not possible with tools like Glaze. If an artist glazes all their art, and someone later manages to mimic their style with generative AI, we may not even know that Glaze failed. And even if we did (e.g., because mimicked art starts appearing online), we wouldn’t know how Glaze failed!

    Since we released all the details of the technique we used to bypass Glaze, it should be easy to produce a new version of Glaze that resists this specific method for future art (although maybe not that easy, as we noted above). But if someone finds a method to bypass Glaze “in the wild”, they may choose not to release it.

There are few security tools that lack both of these properties (patches that restore security, and failures that can be detected). This is for a good reason: a tool that lacks both can break rather catastrophically, since it could be silently defeated by an attack, and even if you discover the attack, the patch won’t apply retroactively. Glaze is such a tool—and its developers should acknowledge this and make it clear to users that art that was protected with an ineffective version of Glaze is irremediably unprotectable (this also implies that any art published online before Glaze was released is similarly impossible to protect in a meaningful way).

There is one highly popular security tool that similarly lacks both of these properties: encryption. Indeed, if someone finds a flaw in your encryption scheme, they can silently decrypt all your communication; and even if you find the flaw and fix it, you cannot retroactively protect your previously encrypted communication. Just imagine the havoc that a break of AES would wreak: there are terabytes of data on the Web that would suddenly be vulnerable. Re-encrypting all this data would be extremely challenging, and also wouldn’t help if someone had already downloaded the vulnerable ciphertexts.

Because of the possibility of such catastrophic breaks, our standards for encryption schemes are incredibly high. We certainly wouldn’t want an encryption scheme to be broken less than a year after it was deployed. In fact, cryptographers try to phase out encryption schemes decades before they might be broken: there is currently a push to standardize schemes that resist attacks from quantum computers, even though a viable attack is nowhere in sight yet. And this is also why serious cryptographers release all the details and code of how their encryption schemes work, let other researchers scrutinize them, and deprecate schemes that start to show weaknesses (notice a pattern?).

We don’t know how to design schemes for protecting against machine learning that meet a similarly high standard. So we should maybe just be transparent that such schemes are mostly for show.

Conclusion

So what did we learn from all of this? We hope that we’ve convinced you why it’s not good for your users’ security to: (1) rely on security through obscurity and actively make it hard for researchers to study your defense; (2) claim to refute attacks without implementing them properly and thoroughly; and (3) make misleading claims about how your tool can recover security after being patched.

If Glaze actually aims to provide security for its users, the team should be willing to engage with researchers who wish to study their system, especially since they were the ones who decided to publish their work at a computer security conference. And when attacks on the system come out, we think their many users deserve a more thorough analysis of how these attacks affect the system, and what security the system can still provide in the longer term despite them.