What is OpenAI’s DALL·E, and why does it matter?
Note: I have received no compensation for writing this piece. Please consider supporting my writing and that of others by becoming a Medium member with this link.
Introduction
We are going to talk about an AI achievement that has been in the news a lot lately. In a nutshell, OpenAI’s DALL·E generates images that match a provided text prompt. The model’s name is a blend of the artist Salvador Dalí and Pixar’s WALL·E.
Throughout this article, we will learn about the following:
- DALL·E’s structure
- The way it was built
- Its ability to combine unrelated concepts in plausible ways
- The diverse text/vision capabilities of DALL·E
- Playing with DALL·E mini
- A discussion about general intelligence
What is DALL·E?
So, what is DALL·E? In July, OpenAI’s GPT-3 showed that it could generate op-eds, poems, sonnets, and computer code. DALL·E is a 12-billion parameter version of the GPT-3 Transformer model that interprets natural language inputs (such as “a green leather purse shaped like a pentagon” or “an isometric view of a sad capybara”) and generates corresponding images.
In one example from OpenAI’s blog, the model renders images from the prompt “a living room with two white armchairs and a painting of the colosseum. The painting is mounted above a…