‘Typographic attacks’ bring OpenAI’s image recognition to its knees

CLIP's identifications before and after attaching a piece of paper that says 'iPod' to an apple. (Screenshot: OpenAI)

Fooling a Terminator into not seeing you might be as simple as carrying a giant sign that says ROBOT, at least until Elon Musk-backed research group OpenAI trains its image recognition system not to misidentify things based on a few Sharpie scribbles.

OpenAI researchers published work last week on CLIP, their state-of-the-art neural network that allows computers to recognize the world around them. Neural networks are machine learning systems that can be trained over time to get better at a given task – in CLIP's case, identifying objects in an image – using a network of interconnected nodes, in ways that are not always immediately apparent to the system's developers. The new study concerns "multimodal neurons", which exist in biological systems such as the brain as well as in artificial systems such as CLIP; they "respond to clusters of abstract concepts centered around a common high-level theme, rather than any specific visual feature." At its highest layers, CLIP organizes images as a "loose semantic collection of ideas".

For example, the OpenAI team writes, CLIP has a multimodal "Spider-Man" neuron that fires upon seeing an image of a spider, the word "spider", or an image or drawing of the eponymous superhero. One side effect of multimodal neurons, according to the researchers, is that they can be used to fool CLIP: the research team was able to trick the system into identifying an apple (the fruit) as an iPod (the device made by Apple) simply by sticking a piece of paper that says "iPod" on it.
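To make the mechanism behind this concrete, here is a minimal sketch of the kind of zero-shot classification CLIP performs, using OpenAI's open-source "clip" package. The image file names and candidate labels are illustrative placeholders, not files or prompts from the study itself.

```python
# Sketch: score an apple photo against two text labels with CLIP,
# before and after an "iPod" note is attached to the fruit.
# Assumes the openai/CLIP package (pip install git+https://github.com/openai/CLIP.git).
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# CLIP embeds the candidate labels as text and the photo as an image;
# the label whose embedding is most similar to the image embedding "wins".
labels = ["a Granny Smith apple", "an iPod"]
text = clip.tokenize(labels).to(device)

for path in ["apple.jpg", "apple_with_ipod_note.jpg"]:  # placeholder file names
    image = preprocess(Image.open(path)).unsqueeze(0).to(device)
    with torch.no_grad():
        logits_per_image, _ = model(image, text)
        probs = logits_per_image.softmax(dim=-1).cpu().numpy()[0]
    print(path, dict(zip(labels, probs.round(3))))
```

In the researchers' demonstration, the second image of this kind flipped the top label from the fruit to the Apple device.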

CLIP's identifications before and after attaching a piece of paper that says 'iPod' to an apple. (Graphic: OpenAI)

What's more, the system was actually more confident that it had correctly identified the item in question when it had not.

The research team dubbed the glitch a "typographic attack" because it would be trivial for anyone aware of the problem to exploit it deliberately:

We believe attacks such as those described above are far from simply an academic concern. By exploiting the model's ability to read text robustly, we find that even photographs of handwritten text can often fool the model.

[…] We also believe that these attacks may take a more subtle, less conspicuous form. An image given to CLIP is abstracted in many subtle and sophisticated ways, and these abstractions may over-abstract common patterns – oversimplifying and, by virtue of that, overgeneralizing.

This is not so much a failure of CLIP as it is an illustration of how complex the underlying associations it has built over time are. As the Guardian notes, OpenAI's research found that the conceptual models CLIP builds are in many ways similar to the functioning of a human brain.

The researchers expected the apple/iPod problem to be just one obvious example of an issue that could manifest in myriad other ways in CLIP, because its multimodal neurons "generalize across the literal and the iconic, which may be a double-edged sword." For example, the system identifies a piggy bank as the combination of its "finance" and "dolls, toys" neurons. The researchers found that CLIP will therefore identify an image of a standard poodle as a piggy bank when they force the finance neuron to fire by drawing dollar signs on the image.
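One could probe this variant of the attack with nothing more than an image-editing step bolted onto the scoring sketch above. The file name, symbol positions, and label set below are assumptions for illustration, not the researchers' actual setup.

```python
# Sketch: scatter dollar signs over a photo with Pillow, then re-score it
# with the same zero-shot CLIP pipeline shown earlier.
from PIL import Image, ImageDraw

img = Image.open("poodle.jpg").convert("RGB")  # placeholder file name
draw = ImageDraw.Draw(img)
for xy in [(30, 30), (120, 60), (60, 150)]:   # arbitrary positions
    draw.text(xy, "$", fill="green")          # a real test would use a larger font via ImageFont
img.save("poodle_with_dollar_signs.jpg")

# Re-running the earlier CLIP scoring on both files with labels such as
# ["a standard poodle", "a piggy bank"] would show whether the added
# symbols shift the prediction toward the piggy bank.
```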

The research team noted that the technique is similar to "adversarial images" – images crafted to trick neural networks into seeing something that isn't there – but is generally far cheaper to pull off, since it only requires paper and something to write with. (As the Register noted, visual recognition systems are still largely in their infancy and are vulnerable to a range of other simple attacks, such as when McAfee Labs researchers tricked a Tesla Autopilot system into thinking a 35 mph speed-limit sign was actually an 80 mph sign using a few inches of electrical tape.)
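For contrast, a classic adversarial image is usually produced by nudging pixel values along the model's loss gradient. The sketch below uses the fast gradient sign method, one standard way of doing this; the torchvision ResNet-18 is a stand-in classifier (not CLIP), and the random tensor and class index are placeholders for a real, preprocessed photo and its label.

```python
# Sketch: fast gradient sign method (FGSM) adversarial perturbation.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

def fgsm_attack(image, true_label, epsilon=0.01):
    """Nudge each pixel in the direction that increases the model's loss."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # The perturbation is tiny and hard for humans to see,
    # but it can be enough to flip the model's prediction.
    return (image + epsilon * image.grad.sign()).detach().clamp(0, 1)

x = torch.rand(1, 3, 224, 224)   # stand-in for a real preprocessed photo
y = torch.tensor([285])          # illustrative ImageNet class index
x_adv = fgsm_attack(x, y)
print(model(x).argmax(1), model(x_adv).argmax(1))
```

The typographic attack achieves a similar misclassification without any gradient access at all, which is what makes it so cheap.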

CLIP's associative model, the researchers added, also has the potential to go significantly wrong and draw bigoted or racist conclusions about various types of people:

For example, we have observed a "Middle East" neuron [1895] with an association with terrorism; and an "immigration" neuron [395] that responds to Latin America. We have even found a neuron that fires for both dark-skinned people and gorillas [1257], mirroring earlier photo tagging incidents in other models we consider unacceptable.

"We believe that these investigations of CLIP only scratch the surface in understanding CLIP's behavior, and we invite the research community to join us in improving our understanding of CLIP and models like it," the researchers wrote.

CLIP is not the only project OpenAI has been working on. Its GPT-3 text generator, which OpenAI researchers described in 2019 as too dangerous to release, has come a long way and is now able to produce natural-sounding (but not necessarily convincing) fake news articles. In September 2020, Microsoft acquired an exclusive license to put GPT-3 to work.
