Gold Rush in the Uncanny Valley — Pt II: The Case for Creativity
Our CTO, Kasper Kuijpers, explores latent space in Stable Diffusion 1.5.
Human society has always forged new tools and adapted them, not only for practical and intentional purposes but also for creative and accidental ones.
As I write this article (not using GPT, I promise), I draw from what I’ve read, observed, and listened to throughout my life. I’ve stumbled across connections and associations between them that I can use as vehicles to help me better explain my points to you. The beauty and the problem here is that I haven’t read the entirety of human knowledge; however, my lived experiences offer me a useful frame of reference.
Unlike my knowledge, the current models being released, and the modes of interacting with them, are creating access to much, much larger parts of human knowledge. (Hyperbole alert: in this conversation between Lex Fridman and Sam Altman, it’s being likened to the invention of fire and/or the wheel. Time will tell...)
The ability to understand and navigate this knowledge could allow for so many new connections, associations and novel outputs to be formed, and it stands to reason that what's coming next could be unfathomable. In this article I intend to scratch the surface and shine a light on where to look and what could be possible, through the lens of “AI Art.”
The problem with AI Art
There are accurate reflections on why AI Art isn’t art (ref Convivial Society article). In summary, art is a form of expression from artist to observer that seeks to communicate a set of intentions and beliefs, and the observer can ponder what the other human mind was intending to express. It's a dance, an indirect dialogue between minds.
I agree with the writer that this dialogue between minds gets lost when an AI is the (main) creator, especially when it becomes too easy (hi Midjourney 5) and the effort bar drops so low. With MJ5 there’s little art of prompt engineering needed; it just gives you high-quality, director-of-photography-like output that is heavily steeped in Western culture, whether you like it or not.
However, I believe that there are plenty of practical use cases in which a very low barrier to high-quality hallucinations is useful. Examples include replacing stock photography, or quickly exploring ideas and mood boards that otherwise would never have been tried due to prohibitive production costs.
Artfulness comes back when we intentionally use the tools and master them, as well as the materials. The parallel that is beaten to death is the introduction of photography. When photography rose to popularity, painters were naturally concerned about their role in art and defensively claimed photography would never be seen as “real art”. What’s the fun and point of seeing totally realistic depictions of what you can already experience with your own eyes? As we know, painting and photography serve different artistic purposes. Can the same be said for AI art?
AI’s grain
As someone who has spent the last 20+ years building websites, I keep thinking about the web’s grain, a parable introduced by Frank Chimero in 2015. The idea: as in woodworking, particular materials have a specific grain and require specific tools and handling (fit for the material). They all have qualities and constraints that will help you reach certain outcomes.
Having steeped in, observed, and gotten my hands dirty with all the developments that have been set in motion, I have come to realize that this is true here too. The materials are the notebooks, platforms and models we have at our disposal.
DALL-E has certain qualities, as does Midjourney across its versions, and so does Stable Diffusion. There are different datasets and differences between GPTs, and so on. Moreover, you have your hammers and saws (prompting), chisels (inpainting), and sandpaper (samplers, CFG scales). There are also complete sawmills (Automatic1111, ComfyUI and DreamBooth for training). Today, people are getting very good at using and combining these tools creatively.
I think a lot about folks like K Allado-McDowell who have written books (Pharmako-AI, Air Age Blueprint) with GPT-2 and GPT-3. They have gotten so intimate with these tools that it would not come as a surprise if they end up sticking to or preferring certain ‘old’ versions because of the style and aesthetic they bring to the creative process.
I also think we haven’t mined the old versions enough yet for their specific aesthetics and qualities hidden in latent space, left by the wayside in the pursuit of photorealism (and realistic hands).
It raises the question: what or who are we optimizing for?
Patten underlines this train of thought on this podcast with his recent release, Mirage FM. It is constructed solely out of low-quality samples generated with Riffusion, heavily exploring and leaning into the experimental tool’s odd and imperfect aesthetics (what he refers to as “crate digging in latent space”). It is an ode to these imperfect times before they blip out of existence in the pursuit of perfection.
But while I can be very good with my tools, I can still make stuff that is uncreative or unoriginal; the same applies here too.
Grappling with the tools, materials, and techniques brings back the exploration, the intent, and the search for specific ways to express something. It brings the mind back into the conversation. But this is only one part of the puzzle. In many cases, you will still reach a result much faster, and the number of outputs will become overwhelming.
The observer, curator and creative
A funny thing happens when working with AI in this way: beyond being creative, you suddenly become aware that you also need to play the roles of observer and curator. You reflect on the outputs the machine gives you, jam with them, and eventually evaluate them.
The ability to curate, to discern good from bad, and to think critically about whether an output expresses your intent becomes much more important. Henry Daubrez succinctly captured it on a panel we both were on: whether we have 10 outputs or 10,000 outputs, taste remains. Identifying what's original and tasteful, and curating it, will become a more important skill in this age where the path to production can be so easy.
I rely heavily on human minds: my own frame of reference and experiences, and those of my friends, peers, and colleagues. They help me curate and refocus when meandering through latent space or crate digging for references to use in prompts.
Ultimately, these minds and inner experiences are who we are communicating with and creating for (for now). So we should not only look at productivity gains; we should also embrace AI as a tool for creativity and explore its many to-be-discovered nooks and crannies as a whole new era of creativity unfolds.
TL;DR
What is the artistic purpose of AI art? Who or what are we optimizing for? How can we best curate and distil our creative outputs? In the 2nd part of his ‘Gold Rush in the Uncanny Valley’ series, CTO Kasper Kuijpers further explores the intersection of AI and creativity. He discusses the potential of AI as a tool in a kit of creative parts and the importance of curating creative outputs in an era of easy production.