GPT Image 1 is OpenAI's first natively multimodal image model, originally released in March 2025 as part of GPT-4o. Unlike DALL-E 2 and DALL-E 3, which relied on diffusion-based generation, it uses an autoregressive architecture that handles both text-to-image creation and image-to-image transformation within a single unified model. It marked a significant leap in text rendering accuracy, photorealism, and contextual prompt understanding, and it went viral upon release for its ability to generate Studio Ghibli-style imagery.