
Our Experiment with the AI Image Generator DALL-E 2

[Illustration of a human painter next to a robot painter]
January 26, 2023

You know the saying: “a picture is worth a thousand words.”

You likely also know that countless studies confirm the power of visuals to communicate in a highly efficient and effective manner. In fact, “the human brain can process entire images that the eye sees for as little as 13 milliseconds,” according to neuroscientists from the Massachusetts Institute of Technology (MIT). That’s an incredibly fast processing speed!

Compelling visuals attract attention and engage audiences. Our brains are primed for visuals, scientists say, and that’s what makes images an essential element of marketing, communication, and storytelling.

Over the last few weeks, we’ve explored how new generative artificial intelligence (AI) tools like ChatGPT can support your content creation and other needs. Now, we turn our attention to DALL-E 2, another AI tool released over the last year—and available now in beta. Unlike ChatGPT, which creates text-based content, DALL-E 2 generates images and art based on whatever description you enter. In case you’re wondering, it’s named after both the Spanish surrealist painter Salvador Dalí and the 2008 animated science fiction movie about a robot, “WALL-E.”

Some of the hype surrounding DALL-E 2 suggests that it will allow anyone, regardless of their visual art, photography, or design training, to create professional-grade images in a matter of seconds. But is that really the case? How reliable and adept is DALL-E 2 at meeting the visual content needs of associations and organizations in the nonprofit, government, education, and healthcare spaces? Is DALL-E 2 the panacea we’ve all been awaiting, or is it a tool to use with caution or altogether avoid?

Here, we share initial ideas and insight on what we think of OpenAI’s DALL-E 2 and how you might use it (or not) to support your needs.

How DALL-E 2 works

If you’re like us, you want to understand some of the science behind the new image generator before you start using it. Basically, DALL-E 2 builds on and improves OpenAI’s earlier version, DALL-E, creating visuals with four times the resolution of the previous version, according to OpenAI. The system utilizes mathematical systems known as neural networks to analyze and identify patterns in enormous data sets. One of the neural networks is a model created by OpenAI known as Contrastive Language-Image Pre-training, or CLIP. You can think of CLIP as the link between the language you use to describe what you want and the actual image(s) generated. It matches images to text prompts—and it’s critical to how DALL-E 2 functions.
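To make that matching idea concrete, here's a toy sketch of how a CLIP-style model scores captions against an image: both are mapped into a shared vector space, and the caption whose vector points in the most similar direction wins. The tiny three-number "embeddings" below are made up purely for illustration; real CLIP uses high-dimensional vectors produced by trained image and text encoders.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings -- in real CLIP these are hundreds of
# dimensions, learned from enormous image/text data sets.
image_embedding = [0.9, 0.1, 0.2]  # imagine: a photo of a conference
text_embeddings = {
    "people networking at a conference": [0.8, 0.2, 0.1],
    "a coronary bypass illustration":    [0.1, 0.9, 0.3],
}

# CLIP-style matching: the caption with the highest similarity wins.
best = max(
    text_embeddings,
    key=lambda t: cosine_similarity(image_embedding, text_embeddings[t]),
)
print(best)  # -> people networking at a conference
```

DALL-E 2 runs this matching in the generative direction: CLIP's notion of "how well does this image fit this text" guides the system toward images that score well against your prompt.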

If you’re interested in learning more of the science behind DALL-E 2, we recommend this guide by Ryan O’Connor, a developer educator who breaks down the ins and outs of the system. For now, though, we’ll move on to more practical matters.

Getting started

To start using DALL-E 2, you’ll first need to create an account, which simply involves entering a username and password. You’ll then receive 50 free credits to play around and create images with the system. After that, you’ll get 15 free credits a month and will need to pay for anything beyond that; 115 credits, for instance, will cost you $15.
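Using those pricing figures, here's a rough back-of-the-envelope sketch of what a month of generations might cost you. Note the per-credit rate is pro-rated from the $15-per-115-credit pack, which is an approximation on our part, since OpenAI sells credits in fixed packs rather than individually.

```python
FREE_MONTHLY_CREDITS = 15   # free credits per month after the initial 50
PACK_PRICE_USD = 15.0       # cost of a 115-credit pack
PACK_CREDITS = 115

def monthly_cost(prompts_per_month: int) -> float:
    """Estimated out-of-pocket cost, assuming one credit per prompt
    (each prompt yields four image variations)."""
    paid_credits = max(0, prompts_per_month - FREE_MONTHLY_CREDITS)
    return round(paid_credits * PACK_PRICE_USD / PACK_CREDITS, 2)

print(monthly_cost(10))    # within the free allowance -> 0.0
print(monthly_cost(115))   # 100 paid credits -> about $13.04
```

In other words, casual experimentation stays free, and even heavy use runs only a few dollars a month.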

What does a credit amount to? One credit allows you to generate four variations of an image from a single prompt. In other words, if you pay one credit and enter this prompt…

Create a photo of a diverse group of people networking at a conference

…you will get four versions of that image.

We tried that prompt, and here are the images we got, at the cost of one free credit:

What do you think?

In our opinion, these images lack a strong composition and look like sub-par stock photos that we would never use on a client’s site (or on our own site, for that matter). Much like our results with ChatGPT, they’re not terrible, but they’re not of professional quality either.

We played around with the prompt, entering this variation:

Create a photo of authentic-looking, diverse individuals at a conference.

The results were even more disappointing, if not disturbing:

If you look closely, you’ll notice what appear to be faces layered upon faces, as though each figure is an amalgamation of several individuals. Some of the figures’ hands also look warped and indistinct. And the composition, again, falls short of what a trained photographer or graphic artist would produce.

Next, we tried something outside of photographs, asking the image generator to create a standard medical illustration. Specifically, we asked it to:

Illustrate a coronary bypass in the style of Max Brodel.

These results were equally disappointing, revealing how much the system struggles with complex imagery and information:

The anatomy and accompanying text are flat-out inaccurate and lacking in clarity, violating the cardinal rules of visuals that break down complex information. Here, DALL-E 2 proves itself incapable of handling the multiple layers involved in this kind of work, which require the designer to consider such elements as audience, messaging, and focus to create an effective and informative visual.

We wanted to see, too, what DALL-E 2 could do with a more straightforward, less complex illustration, so we told it to:

Create a simple, modern icon representing patient care.

Here’s what we got:

As you can see, although the images are indeed minimalist and modern in style, they come across as strange and inappropriate, with the patient’s body cut in half or oddly depicted, and a cross hovering above.

With time and continual refinement of your prompts, you might have better luck. But our sense at this point is that it would take longer to train the AI to do what we need it to do than to design the icon ourselves. And while we always advocate for authentic photographs (of real people doing real things), when and if your time and budget allow for it, we’re sticking with stock photos for now as an alternative, instead of running to DALL-E 2.

That said, we’re still excited about DALL-E 2 and see it as a major milestone in AI and art generation that will improve and grow more sophisticated over time. We also think it could serve as a promising tool for assisting content creation and perhaps inspiring ideas—and for editing or adding features to existing images (another capability of DALL-E 2).

Right now, though, when it comes to using AI to assist your art and design work, you might be surprised to learn that AI is already part of some of the most popular design software. For example, Adobe Photoshop offers an AI-based Object Selection tool that lets you easily change the background of a photo to suit your needs. This means you no longer have to make changes by hand with the masking, selection, or paintbrush tools.

Another helpful AI feature in Photoshop is Content-Aware Fill, which lets you remove objects from photos. You can also alter photos to make them fit whatever space you need to fill. For instance, you might have a perfect image for your website, except that it’s not long enough. Now, instead of painting in a background to lengthen the photo with the cloning tool (which is tedious and challenging to make convincing!), Content-Aware Fill will do it for you.

Although the results are not always 100 percent perfect, Content-Aware Fill will get you 95 percent of the way there quickly, saving you time (and money!) in the end.

As we continue to play around with and consider various uses of DALL-E 2, we’re also cognizant and supportive of artists’ and writers’ rights—and staying abreast of some of the lawsuits against AI-based image generators underway now. We’re contemplating and discussing such topics as the ethical issues of appropriating someone else’s art (without giving credit), and whether the use of DALL-E 2, even as an “assistant,” helps or hinders the creative process.

We’ll keep you posted on our discussions and updated on what we discover. Meanwhile, if you’re looking for a partner to create engaging user experiences, know that we’re here for you. Reach out to start the conversation.
