What is OpenAI’s DALL·E and why does it matter?

Chouaieb Nemri
5 min read · May 3, 2022

Note: I have received no compensation for writing this piece. Please consider supporting my writing and that of others by becoming a Medium member with this link.

Introduction

We are going to talk about an AI feat that has been in the news a lot lately. In a nutshell, OpenAI’s DALL·E generates images that match a text prompt you give it. Its name is a blend of the painter Salvador Dalí and Pixar’s WALL·E.

Throughout this article, we will learn about the following:

  • DALL·E’s structure
  • The way it was built
  • Its ability to combine unrelated concepts in plausible ways
  • The diverse text/vision capabilities of DALL·E
  • Playing with DALL·E mini
  • A discussion about general intelligence

What is DALL·E?

So, what is DALL·E? In 2020, OpenAI’s GPT-3 showed that it could generate op-eds, poems, sonnets and computer code. DALL·E is a 12-billion parameter version of the GPT-3 Transformer model that takes natural language inputs (such as “a green leather purse shaped like a pentagon” or “an isometric view of a sad capybara”) and generates corresponding images.
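
To make “text in, image out” a little more concrete: according to OpenAI’s blog post, DALL·E reads the prompt and the image as a single stream of up to 1280 tokens (256 for the text, 1024 for the image) and produces the image tokens one after another. The toy Python sketch below mimics only that loop. The function names, the padding id, the example token ids and the random stand-in for the model are all assumptions made for illustration; this is not OpenAI’s code, and the separate decoder that turns image tokens into pixels is not shown.

```python
import random

TEXT_LEN = 256      # text positions in the stream (per OpenAI's blog post)
IMAGE_LEN = 1024    # image positions: a 32 x 32 grid of discrete image tokens
IMAGE_VOCAB = 8192  # size of the image-token codebook
PAD = 0             # hypothetical padding id for unused text positions


def toy_next_token(stream, rng):
    """Stand-in for the Transformer: returns a random image token.

    A real model would compute a probability distribution over the image
    vocabulary, conditioned on every token already in `stream`.
    """
    return rng.randrange(IMAGE_VOCAB)


def generate_image_tokens(text_tokens, seed=0):
    """Build the text prefix, then append image tokens one at a time."""
    rng = random.Random(seed)
    prefix = list(text_tokens)[:TEXT_LEN]
    stream = prefix + [PAD] * (TEXT_LEN - len(prefix))
    for _ in range(IMAGE_LEN):
        stream.append(toy_next_token(stream, rng))
    image_tokens = stream[TEXT_LEN:]
    side = 32  # reshape the flat list of 1024 tokens into a 32 x 32 grid
    return [image_tokens[r * side:(r + 1) * side] for r in range(side)]


grid = generate_image_tokens([17, 42, 99])  # hypothetical token ids for a prompt
print(len(grid), len(grid[0]))  # prints: 32 32
```

The point of the sketch is the shape of the process: the prompt is a fixed-length prefix, and the image is “written” token by token after it, the same way GPT-3 writes the next word of a sentence.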

In one example from OpenAI’s blog, the model renders images from the prompt “a living room with two white armchairs and a painting of the colosseum. The painting is mounted above a…
