OpenAI has introduced advanced image generation capabilities in its GPT-4o model. The new capability offers significantly improved precision, detail, and realism compared to previous versions. Users can now generate new images and edit existing ones using simple text prompts.

GPT-4o brings notable improvements in creating complex, multi-object images. The model can handle up to 10 to 20 different objects in a single coherent visual. Additionally, its ability to accurately render text and symbols improves the production of informative graphics such as logos, diagrams, and infographics.

Examples shared by OpenAI include whiteboard meeting notes, comic book illustrations, detailed scientific infographics, and visuals that incorporate text. The model is designed not just as an aesthetic tool but also as a practical solution for information sharing and communication.

A key feature of GPT-4o is its step-by-step image refinement capability. Users can iteratively refine their generated images through natural conversation, enabling a structured and interactive design process. For instance, a game character can be gradually adjusted while maintaining visual consistency at every stage.

The model also analyzes uploaded images to generate new visuals based on them, making the image creation process more intuitive and personalized. GPT-4o is capable of producing diverse styles, from photorealistic images to artistic renderings, while ensuring high-quality transformations.

OpenAI acknowledges that the model still has some limitations. It may struggle with graphics that are dense with small text or with images containing multilingual text, and occasional inconsistencies or unintended cropping can occur. The company has stated that further improvements will be made in these areas.

To ensure responsible use, OpenAI has implemented various safety measures. All images generated by GPT-4o include C2PA metadata, which identifies the content as having been generated by GPT-4o. Additionally, requests for harmful content are automatically blocked.

Starting today, GPT-4o’s image generation capabilities are available by default for ChatGPT Plus, Pro, Team, and free users. Enterprise and Edu users will gain access soon. Developers will also be able to access these features through the API in the coming weeks. Meanwhile, DALL·E users can continue accessing OpenAI’s dedicated DALL·E GPT for their image generation needs.
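
For readers who want to check whether a downloaded image still carries the C2PA metadata mentioned above, the following is a minimal sketch rather than an official tool. It assumes the image was saved as a PNG and relies on the C2PA specification's convention of storing the manifest in a PNG chunk of type `caBX`. The function names are hypothetical. The code only detects that such a chunk is present; it does not validate the manifest or its signature, and metadata can be stripped when an image is re-saved or screenshotted.

```python
import struct

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk_types(data: bytes) -> list[str]:
    """Return the chunk type names found in a PNG byte string, in order."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    types = []
    offset = len(PNG_SIGNATURE)
    # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
    while offset + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        types.append(chunk_type.decode("latin-1"))
        offset += 8 + length + 4
    return types


def has_c2pa_chunk(data: bytes) -> bool:
    """True if the PNG contains a 'caBX' chunk (assumed C2PA manifest store)."""
    return "caBX" in png_chunk_types(data)
```

A typical use would be `has_c2pa_chunk(open("image.png", "rb").read())`, where `image.png` is a placeholder path. For actual verification of the manifest contents, a dedicated C2PA validation tool is needed.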