It’s the end of the year, and I will be taking a short break until 2023. This post is a kind of “best of the year” post focused on other people’s writing: specifically, writing that contextualizes and critiques the emergence of AI-based image generation, but also informs my sense of how to find a way forward. This “way forward” is a means of navigating this technology as an artist, more than as a researcher intent on solving its many issues. For artists, though, these tools, and their problems, are unlikely to be sent back to the obscurity of academic conferences and research labs. The tech is here, and so are its problems. What now?
This post reflects a larger thread posted to Twitter this weekend — if I’ve missed something you think is worth reading, let me know.
Where It Came From
Fabian Offert’s “Ten Years of Image Synthesis” offers a timeline of milestones, from ImageNet in 2012 through the rise of GANs to today’s Diffusion models. But there are other contexts to consider, such as how these datasets were compiled in the first place. Part of that is a story of laborers, and “The Exploited Labor Behind Artificial Intelligence” by Adrienne Williams, Milagros Miceli, and Timnit Gebru is an excellent primer. It describes a form of “AI” propped up by underpaid gig workers responsible for identifying, cropping, and labeling images and cleaning data for pennies, often under high-pressure management regimes built on surveillance, mind-numbing repetition, and strange demands:
“In addition to labeling data scraped from the internet,” the article notes, “some jobs have gig workers supply the data itself, requiring them to upload selfies, pictures of friends and family or images of the objects around them.”
Then there is the context of what is actually inside those datasets. One questionable subset is the corner dedicated to people’s private medical imagery: one artist found her own before-and-after photos of jaw surgery, taken by her surgeon, in the data. Abeba Birhane, Vinay Uday Prabhu and Emmanuel Kahembwe found widespread misogyny, pornography and stereotypes in the LAION dataset used by many online diffusion models. The dataset is built from a web crawl that currently spans 320TB and 3.1 billion webpages. A subset of those pages yields image-text pairs, which help steer prompt-based systems toward producing images based on a user’s description. Many of these pairings draw on pornographic databases, especially text-image pairs related to Black women. The paper outlines a number of additional examples of violent, racist and sexist imagery and text-description bias, all of which lie in wait within these systems: even with content filtration turned on, the context this data creates seeps into what we see.
A case in point is an essay by Melissa Heikkilä for the MIT Technology Review, which describes how the AI photo app Lensa (built on Stable Diffusion and the LAION dataset) created highly sexualized images from even the most professional, LinkedIn-friendly headshots, likely because of the associations the model makes with Asian women. Olivia Snow, writing in Wired, described how even uploading photographs of herself as a child generated highly inappropriate, sexualized imagery.
Then there is the issue of living artists whose work forms a large part of the corpus used to train Diffusion-based models. Greg Rutkowski is an artist, known for fantasy-themed work, whose name has become famous as a prompt, used some 93,000 times. As a result of users pasting his name into the prompt window and then posting the results, his own work has been drowned out on search engines.
Where It’s Going
While these articles paint a bleak picture, plenty of writers are acknowledging these problems and seeking ways to rethink how AI image models are trained, deployed, and used. It begins, to put it simply, with moving beyond accepting the output of AI models as the site of “art.”
The content has these social residues baked in. The images do too, and taking them at face value suggests a willingness to accept the status quo of the archive. For some, this won’t be a concern. But for those who see art as a way of putting forward new spaces of possibility, these systems have severe flaws. You may be able to create interesting architectural flair or depict unlikely arrangements of style. But you can’t get beyond the original context of the data by accepting the images that emerge from it. And that context includes the social, racial and gender biases embedded in the data.
Instead, we might start to look at these outputs as a malleable material wrought from the datafication of social biases and cultural contexts. Treating these outputs as data debris means they can be repurposed to reveal and confront the reductive logics of datafication that produce them. Similar strategies were used by the Dadaists and Situationists to build critiques of their own moments. Collage, détournement, juxtaposition, repurposing and reuse are all viable strategies for what to do with these images beyond clicking save.
But that means better understanding what we’re looking at when we look at these images. It has been challenging to subvert those meanings because the process of meaning-making through these images has been lost to the sense of magic they produce, and to the fog created by the concept of an “artificially intelligent artist.” We need to see the images for what they are before we can make them into something else.
En Liang Khong, writing in ArtReview, notes that:
… what is distinctive about this art is that much of its content lies outside the field of conventional perception. To misappropriate a term used by the media scholar Alexander Galloway in the context of gaming, there is an ‘allegorithm’ to these works, or rather an allegorical level on which their algorithms function. The visceral, sensual experience of an artwork is now bound up in often hazy attempts to assess its underlying connection to an algorithm; the latter’s workings are often shielded from us. The opacity is part of the allure (as it is while playing a videogame, or posting on social media, for instance).
This desire to see the image as a tool for understanding the data that produced it was at the heart of my own writing this year, particularly the post on How to Read an AI Image. This is meant as a guide for interpreting images, but also points to strategies for treating the images as material, rather than final products.
Making sense of what these images are is crucial to figuring out what artists can actually do with them. To that end, Kevin Buist, writing for Outland, produced a thoughtful orientation to DALL-E 2 and how artists might see these things: “When we look at AI images,” he writes, “we’re unable to match our subjectivity as viewers with the artist’s subjectivity as a creator. Instead of a particular human experience, we’re shown only averages.”
The limits of AI imagery as art are further described by an artist who works with AI, Annie Dorsen:
These tools represent the complete corporate capture of the imagination, that most private and unpredictable part of the human mind. Professional artists aren’t a cause for worry. They’ll likely soon lose interest in a tool that makes all the important decisions for them. The concern is for everyone else. When tinkerers and hobbyists, doodlers and scribblers—not to mention kids just starting to perceive and explore the world—have this kind of instant gratification at their disposal, their curiosity is hijacked and extracted. For all the surrealism of these tools’ outputs, there’s a banal uniformity to the results. When people’s imaginative energy is replaced by the drop-down menu “creativity” of big tech platforms, on a mass scale, we are facing a particularly dire form of immiseration.
As I work through all these issues, I’m confronted with the seduction of the image-making process. It’s a joy, and yet everything that surrounds it seems to be quite a mess. Scouring for examples of historic responses to the powerful, seductive co-option of the human imagination, I’m left with the old standbys: the Dadaists, the Situationists, Fluxus. Each responded to the corrupting ideology embedded in the images their cultures produced. AI images are no different.
Some artists, like Agnieszka Pilat, have suggested that artists need to reclaim agency over their tools, and participate in shaping these technologies. I agree, but I think we also need to reclaim agency over the images these tools produce. Images mean something, and that meaning is never on the surface. AI images are the granular bits of a cultural and social language formed through data and algorithms. As we learn to speak that language, we can use its “words” to tell our own stories.
The product on the screen can’t be the end of the creative process. The images we produce with these systems are the product of a collective imaginary processed by a computational agenda. They can mirror where we are, but they can also be used to reflect and steer that imaginary back toward human agency, social relationships, and imaginative capacity. Accepting the product of an algorithmically mediated collective imagination when it appears on your screen isn’t where the art happens. But there is always the opportunity to reorient those products into new meanings, through juxtapositions and deliberate, creative conflicts with their intended, encoded “meanings.”
If these images are to steer us toward play and away from consumption, that work is still on us. The designers of these tools are otherwise happy to have us pull the levers and succumb to their stream of generative content. They are a technological achievement designed only to stand as a technological achievement. If we want them to do, be, or show us anything else, it’s going to require a blend of creativity and strategy that we might call art. We can contest the world inside the dataset — and I hope, and suspect, that as this genre of AI art matures, we will see more of that contestation emerge. That’s when things will get really exciting.
See you in 2023.
Please share, circulate, quote, link, or print this post into the sky with specialty fireworks. Word of mouth means a lot to me.