A landscape image generated with Midjourney

What is AI-generated art?

13 September 2022

Amendment 1/2024: Due to the rapid pace of development of generative AI tools, we urge the reader to take note of the original publication date of the following article. The information presented here might be vastly out of date, either partially or in full.

In the following description, I will attempt to simplify artificial intelligence (AI) in the context of image generation so that it is as understandable as possible. This article is not intended as a scientific definition of the subject but rather as a primer for beginning to understand a complex topic. Here we will discuss narrow artificial intelligence (ANI), which merely runs predefined tasks according to user input. A more sophisticated AI would be artificial general intelligence (AGI), which is only theoretical for now and would be on par with human capabilities. In this text, we focus further on computer vision, how text-to-image generators work, and how youth workers could leverage them in youth work.

Learning to be intelligent

What does artificial intelligence, or AI, mean, then? To put it simply, an AI is a computer program. Most often, AIs are used to deal with large amounts of data. The most common use of this kind of AI is to recognise repeating patterns or find discrepancies in the data and produce an analysis based on them. The process is called artificial intelligence because it resembles the human capabilities of deduction, combining information and creating something new on that basis.

Before creating machine learning models for generating digital images, researchers developed machine vision for recognising objects and subjects in digital still and video images. This can include things such as whether the picture contains people or whether it depicts a daytime or nighttime scene. For each separate task, an AI needs to be specifically trained. For example, researchers can train a machine vision model by inputting 1000 photographs of a human face. The computer will then analyse the data by scouring the pictures for repeating patterns. These common patterns would eventually form something that the human training the algorithm would name “a face”. Then the human operators can further direct the AI to look for the same recurring patterns in different images and call them “a face”. 

To slightly simplify things, machine learning means that the algorithm – or AI – is adding newly recognised patterns as a part of its existing database. This way, it can get more adept at identifying human faces in an image the next time round. This kind of AI is already present in almost every smartphone camera, where the software can recognise a face in the image and focus on it even before the user takes the picture. Researchers can use the same process to train algorithms to identify a variety of objects in images and eventually create a diverse (yet still, by definition, a narrow) AI that can find a plethora of things in existing images.
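The idea of finding a recurring pattern in training examples and then looking for it in new images can be sketched in a few lines of Python. This is only a toy illustration under heavy assumptions: the "images" are invented 3x3 grids of dark and light pixels, and the function names and the 0.8 threshold are made up for this example. Real machine vision models are far more complex.

```python
# Toy illustration only. Each "image" is a flat list of 9 pixel values
# (a 3x3 grid), where 1 = dark and 0 = light. All data here is invented.

def learn_template(examples):
    """Average the examples pixel by pixel to find the recurring pattern."""
    count = len(examples)
    return [sum(pixels) / count for pixels in zip(*examples)]

def similarity(image, template):
    """Return a score between 0 and 1; higher means closer to the template."""
    distance = sum(abs(p - t) for p, t in zip(image, template))
    return 1 - distance / len(template)

def looks_like_face(image, template, threshold=0.8):
    """Call the image "a face" if it is close enough to the learned pattern."""
    return similarity(image, template) >= threshold

# Hypothetical training data: two "eyes" on the top row, a "mouth" at the bottom.
training_faces = [
    [1, 0, 1,
     0, 0, 0,
     1, 1, 1],
    [1, 0, 1,
     0, 1, 0,
     1, 1, 1],
    [1, 0, 1,
     0, 0, 0,
     0, 1, 0],
]

template = learn_template(training_faces)

new_face = [1, 0, 1, 0, 0, 0, 1, 1, 1]
not_a_face = [0, 1, 0, 1, 1, 1, 0, 0, 0]

print(looks_like_face(new_face, template))
print(looks_like_face(not_a_face, template))
```

Adding more examples to training_faces and calling learn_template again mirrors, very loosely, how the algorithm gets better as it sees more data.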

Let’s approach the topic from another perspective. There is an ongoing debate about whether current narrow artificial intelligences should rather be called augmented intelligence. A narrow AI cannot function without a human operator who defines the boundaries within which it operates. We must be able to ask the computer the right questions; only then can it give us the information we seek, sometimes even faster and more accurately than we could find it ourselves. An AI could be given the rules and goals of chess, along with every possible move, and made to analyse the best possible way to win after every move. The computer sees the game as a changing labyrinth in which it is trying to find the shortest route out. After each move, the machine tests every possible turn in the maze and picks the best move according to the parameters set by the user. In chess, this straightforward strategy can leave holes in the defence and allow a human player to strike back. Human creativity and adaptability to changing situations are something that narrow AIs cannot be programmed for.
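The "test every turn of the maze" idea can be shown with a game that is small enough to search completely. The sketch below is an assumption-laden stand-in for chess, not how a chess program works in practice: chess has far too many positions to enumerate, so real programs search only part of the game. The stick game, its rules and the function names are invented for this illustration.

```python
from functools import lru_cache

# Toy game standing in for chess: players alternate taking 1, 2 or 3 sticks
# from a pile, and whoever takes the last stick wins.
MOVES = (1, 2, 3)

@lru_cache(maxsize=None)
def can_win(sticks):
    """True if the player about to move can force a win from this position."""
    # Try every legal move; a move is good if it leaves the opponent
    # in a position from which they cannot force a win.
    return any(not can_win(sticks - take) for take in MOVES if take <= sticks)

def best_move(sticks):
    """Return a winning move if one exists, otherwise take a single stick."""
    for take in MOVES:
        if take <= sticks and not can_win(sticks - take):
            return take
    return 1

print(best_move(10))
print(can_win(8))
```

Note that the program only answers the question a human has framed for it: the rules, the goal and the list of allowed moves all come from the operator.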

We can look at a third example. While writing this text, I have repeatedly seen a wavy red line appear under a word I have written. The word processing software I’m using can suggest corrections and improvements to the text I’m writing. (Translator’s note: To take things even further, the translation from Finnish to English has been corrected by Grammarly, which uses an algorithm to correct the wording, punctuation and clarity of written text in real time, based on my indicated writing style and target audience. Here one can see that the tools available in English are far more advanced, because the English language provides the AIs with much more training material than the far smaller Finnish language.) To my chagrin, I must note that the programme isn’t yet able to output a whole block of text from the input “explain AI in a simplified way to youth workers”. Text-generating AIs do exist, though. Some text- and image-generating AIs are already so advanced that it is nigh impossible to discern whether a human or an AI has produced the content. Even these models require input, i.e., user-defined guidelines for content creation. AIs still lack creativity and intuition, although work on modelling facsimiles of even these is underway.

A screenshot of a word processing program called Grammarly informing the user of errors in the written text
An AI at work: Grammarly informing the translator of mistakes in the written text.

Artificial Intelligence as a tool for visual art 

An AI that analyses images or videos is called computer vision (CV). This kind of AI analyses the content of a picture based on the tasks or parameters it is given. In industrial applications, companies can use CV to count things, identify different objects, or support quality control processes. Law enforcement can benefit from a CV system that reads licence plates and checks them against outstanding fines or other records in a database. CV is also used to categorise pictures; in your smartphone gallery app, you might already be able to search for all the photos containing cats or dogs. These algorithms are also used to improve the results of online search engines.

How does computer vision see images, then? One apt description I have come across compares it to a human staring at the sky and finding familiar patterns in the clouds. The clouds are unlikely to really resemble an elephant, a car or a cartoon character. Still, we tend to reinforce our interpretation of what we sense and can experience seeing a face in the clouds. The engineers over at Google tested this on an AI. They tasked their algorithm with altering a given image according to the patterns it identified in that image. When the engineers repeated the process 5 to 10 times, even the human eye started to discern the patterns that the CV had identified in the picture. The higher the number of iterations, the more pronounced the AI’s effect on the image. The process yielded some outright scary pictures, even if some were dreamlike and beautiful. Why were some of the pictures scary, you ask? The most common version of the process, which Google dubbed “DeepDream”, was one that searched pictures for images of dogs.
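The feedback loop described above, where the image is repeatedly altered to strengthen whatever the algorithm already faintly sees, can be sketched as follows. This is a heavily simplified illustration and not Google's code: the real DeepDream relies on a trained neural network, whereas here a hand-written four-value pattern stands in for the detector, and the starting values and step size are arbitrary assumptions.

```python
# Hypothetical "detector": it responds strongly when the image contains
# this alternating pattern. A real system would use a trained network.
PATTERN = [1.0, -1.0, 1.0, -1.0]

def response(image):
    """How strongly the detector 'sees' its pattern in the image (a dot product)."""
    return sum(p * x for p, x in zip(PATTERN, image))

def dream_step(image, step_size=0.1):
    """Nudge each pixel in the direction that increases the squared response."""
    r = response(image)
    # The gradient of 0.5 * r**2 with respect to pixel i is r * PATTERN[i].
    return [x + step_size * r * p for x, p in zip(image, PATTERN)]

# Faint, arbitrary starting values: the "clouds" in which a pattern is barely visible.
image = [0.05, 0.02, 0.04, 0.01]

history = [abs(response(image))]
for _ in range(10):
    image = dream_step(image)
    history.append(abs(response(image)))

print([round(value, 3) for value in history])
```

Each pass makes the detector's response stronger than the last, which loosely matches the observation that more iterations produce a more pronounced effect on the picture.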

A picture of the Mona Lisa, processed by the DeepDream AI to enhance patterns that contain dogs
"Mona Lisa" with DeepDream effect using VGG16 network trained on ImageNet. Source: Wikimedia Commons.

Most AI-generated images start with white noise, like the static you might have seen on an old TV. This noise acts like a cloud in the sky, in which the algorithm starts to look for the patterns it has been instructed to find. Some AIs can also generate images based on an existing picture, drawing or artwork.

Most of the time, an AI-generated image will not turn out precisely how you wish. This can happen for many reasons. One of the most likely reasons is that the AI isn’t skilled enough to interpret your input accurately. The AI might understand what your wish is about, but it doesn’t have enough data to recreate a photorealistic image. Another reason might be that the input you specified wasn’t accurate enough since even we humans occasionally have different interpretations of language. (Translator’s note: I saw a lovely quote online: “It’s less about getting what you asked for and more about dreaming out loud with a machine”.)

a grainy old looking black-and-white photograph of what seems to be a carnival
An image of a 1930s carnival produced with the Disco Diffusion AI

Research into AI and its application to art production has advanced by leaps and bounds since Google’s DeepDream project. Several universities, companies and private individuals have started creating their own processes for producing images with artificial intelligence. One of the most advanced is Dall-E 2 by the OpenAI research group. This AI has not been fully released yet (as of August 2022) for fear of user abuse. At its best, Dall-E 2 is capable of producing photorealistic images based on text input alone. Google has its own AI called Imagen, and Meta has one called Make-a-Scene. The most notable player on the open-source side of things is Stable Diffusion. Google hasn’t released its AI yet, for the same reasons as Dall-E 2, but you can already test Stable Diffusion.

Researchers might be rightly worried since technologies like these can be misused. In California, legislation has already been created to limit the kind of images that users can generate. For example, users cannot create pornographic material of existing persons or make any material of candidates six months before elections. The EU is also preparing a new artificial intelligence act.

An AI-generated image of cats riding a rollercoaster
An image generated by our trainee Pii on Dall-E 2 with the prompt "A photo of cats riding a rollercoaster"

It is good that preparations are being made for AI and the adverse phenomena it may bring. Personally, though, I don’t believe that regulations and laws can ever entirely prevent the misuse of technology. Similar technologies have already been used in the war in Ukraine for propaganda purposes. In youth work, it is essential to recognise the risks associated with these technologies and to build young people’s capacity to distinguish AI-generated images from real ones.

Although not all AIs are available for us to use yet, we already have several opportunities to start experimenting with text-to-image technologies. Midjourney [https://www.midjourney.com/home/] is already in an open beta test, and the full version will be published soon. (Translator’s note: There is an image generated by Midjourney among Verke’s recent Instagram posts. I challenge you to find it!) Disco Diffusion [https://colab.research.google.com/github/alembics/disco-diffusion/blob/main/Disco_Diffusion.ipynb] is an open-source AI that you can install on your computer. Stable Diffusion can be tested in DreamStudio [https://beta.dreamstudio.ai/], for example.

Midjourney works on Discord, a platform that has become familiar to many youth workers and young people alike, especially since the pandemic, which makes it easy to approach. Disco Diffusion is housed on Google’s Colaboratory platform, which can initially feel a bit challenging. The benefit of Disco Diffusion is that it lets us peek under the hood at the kind of processes and code an image-generating AI needs. This kind of collaboratively created open-source application helps us learn about and understand existing AIs better and can be an excellent platform for educators.

AI art in youth work and art education

One of the aims of digital youth work is to familiarise young people with digital technologies and environments in a non-formal way and help them develop an open but critical relationship with them. Whatever you might think of AI-generated art, it is here to stay now that Pandora’s box has been opened. The youth field needs to keep on top of new technologies and apply them creatively to daily youth work practice. Young people might not be aware of these new technologies yet. Formal education, on the other hand, can be slower to include these new developments in its curricula. It is comforting to remember, though, that youth workers don’t need to be professionals in the field of AI. Youth work expertise is enough, together with the ability to guide a group of young people in exploring new technologies and opportunities. Often it can be enough to guide young people to the right service and show a couple of examples, and the rest will follow almost by itself. A youth worker doesn’t have to know how to do everything; as with many other avenues of youth work, it is enough to say, “I don’t know, but let’s find out together”.

What could be the youth work applications for AI-based image generators? Youth workers often employ emotion cards or something similar to facilitate the beginning of group sessions. With an AI generator, everyone could generate an image based on their current mood or thoughts and help reflect on their feelings. Alternatively, everyone could create a picture of their dream of the previous night, and others in the group could guess what the image depicts. 

In art education, these AIs could generate a quick sketch that the young person can work on further. AIs can even support young people (or adults, for that matter) who don’t see themselves as skilled or talented in visual arts. These approaches can open up new avenues of self-expression when the technical skills of creating a visual artwork aren’t necessarily at the centre of the process. All you need to do is express what you want to create. Some AIs also accept a simple sketch as a starting point to refine. One example of this is depicted below. 

A crude sketch on the left and a detailed AI-generated painting on the right of cliffs and waves
A painted scene produced with Midjourney based on a crude sketch by the author of this text.

In essence, these AIs never create anything completely new. Existing artworks, images and photographs are the base ingredients from which the AIs build their images. This existing data does enable the AIs to mimic the work of existing artists. This capability can be leveraged in art education, for example, to explore the style of specific painters. Educators can present different art styles to young people and have them pick their favourites to explore further. This might motivate young people to study art history even more.

Several users have run tests on how different prompts – such as names of artists or art movements – affect the final image. These can be a good aid for helping a young person start their journey through art history and find a style they find appealing. The following examples are meant for the Disco Diffusion AI, but I have noticed that other AIs have similar capabilities in recognising the styles of artists and art movements.

https://weirdwonderfulai.art/resources/disco-diffusion-70-plus-artist-studies 

https://weirdwonderfulai.art/resources/disco-diffusion-modifiers/ 

AI-generated images can also be an excellent way to generate content for virtual reality-based youth work. The AltspaceVR environment, for example, can be used to create a virtual youth house with info displays of a sort that link to different websites. This could make it possible to produce images within the virtual space and even to stage an AI-generated art exhibition within the AltspaceVR environment.

The possibilities are almost limitless – use your creativity and apply the new technology to your youth work! 

A word on copyright

From the viewpoint of copyright, an AI is like any other device or tool used in content creation and has no rights to the images it has generated. An AI is comparable to a camera, which an artist uses by setting it up correctly, framing the shot and pressing the shutter button. Likewise, the AI is used by defining settings (via prompts) and pressing enter to have the image generated. It is certain, though, that legislators will re-examine these intellectual property rights before long. A likely point of contention is whether AIs create new, unique images when they draw on existing works whose copyright remains with the original authors.

A separate discussion concerns the rights of artists whose images are used both to train AIs and to generate new images. An artist might have spent years honing their technique and recognisable style, and their work can now be replicated with the prompt “..by [artist name]”. Is it fair to copy and model someone’s life’s work like this? On the other hand, art has always been based on learning from others, copying, and developing existing techniques and ideas. Could AI be seen as a new kind of incubator for culture, in which culture evolves faster than in an ordinary environment? Legislators and the general public will examine these questions in the future, but at least for now, everything is in line with existing legislation and guidelines.

Dissenting voices on AI-generated art

AI-generated art has already stirred up a lot of fears and critiques. The most common themes are the potential for abuse of this technology and the narrowness of the current AIs. 

The risk of abuse is undoubtedly real in these times of fake news and malicious bot accounts on social media. Individuals or groups could leverage the technology for nefarious purposes by distorting historical or current images to support a false narrative. For now, the only antidote from a youth work perspective is to help young people learn to recognise these kinds of images and thereby build stronger media skills. I am sure that including metadata in images will become compulsory to help people filter out AI-generated imagery more easily. Researchers developing the AIs have also limited the kinds of prompts people can use to make it harder for users to abuse the platforms.

Another recurring critique is the narrow scope of the current AIs. This is not referring to the type of AI (narrow vs general AI) but rather the fact that the AIs only do what the creators have programmed them to do. Since most programmers are still men, critics can argue that this has led to AIs not being programmed to represent the diversity of humanity fully. One potential long-term solution can be to educate a diverse group of people on AI. The youth field has a vast potential to generate interest in these technologies and careers for young people. The technology-infused world needs a diverse pool of programmers, researchers and enthusiasts for various tasks. Especially in the field of AI development, diversity is and will be critical. People with different backgrounds and world views are needed for AIs to see (and generate) the whole picture.

The third central critique is whether AIs will now displace all artists. The short answer is a resounding “no”. As stated, AIs can only generate images based on the material researchers have trained them on. This material is all created by humans, and new material is constantly needed so that the AIs don’t start repeating the same old themes. After using one of these platforms for a mere couple of months, you will most likely begin to see repeating patterns emerge. No matter how fancy the AI-generated images may be, they are never completely original. AIs can, and probably will to a growing extent, be used by artists as a tool for quickly generating sketches and concepts to be refined further. This can be useful whether you are an aspiring artist or illustrator or someone working in graphic design.

Finally, I want to address the environmental issues surrounding the topic. AI-generated art uses a lot of processing power and memory. With a low-powered home computer, you might not even want to attempt using some of these platforms. What follows is that AI-generated art also has a carbon footprint. Therefore it is advisable for youth workers to also discuss the environmental effects of these technologies with young people as a part of the youth work process.

If you want to learn more about AI, the University of Helsinki has a free online course. You can access the course here: https://www.elementsofai.com/
