ChatGPT Images 2.0

882 points | 583 comments | 17 hours ago
lionkor

Every cent you spend on this, remember: The people who made this possible are not even getting a millionth of a cent for every billion USD made with it (they are getting nothing). Same with code; that code you spent years poring over, fixing, etc. is now how these companies make so much money and get so much investment. It's like open source, except you get shafted.

minimaxir

So during my Nano Banana Pro experiments I wrote a very fun prompt that tests the ability for these image generation models to follow heuristics, but still requires domain knowledge and/or use of the search tool:

    Create a 8x8 contiguous grid of the Pokémon whose National Pokédex numbers correspond to the first 64 prime numbers. Include a black border between the subimages.

    You MUST obey ALL the FOLLOWING rules for these subimages:
    - Add a label anchored to the top left corner of the subimage with the Pokémon's National Pokédex number.
      - NEVER include a `#` in the label
      - This text is left-justified, white color, and Menlo font typeface
      - The label fill color is black
    - If the Pokémon's National Pokédex number is 1 digit, display the Pokémon in a 8-bit style
    - If the Pokémon's National Pokédex number is 2 digits, display the Pokémon in a charcoal drawing style
    - If the Pokémon's National Pokédex number is 3 digits, display the Pokémon in a Ukiyo-e style
The NBP result is here, which got the numbers, corresponding Pokemon, and styles correct, with the main point of contention being that the style application is lazy and that the images may be plagiarized: https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:oxaerni...

Running that same prompt through gpt-image-2 (high) gave an...interesting contrast: https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:oxaerni...

It did more inventive styles for the images that appear to be original, but:

- The style logic is applied by row, not by the actual numbers, and is therefore wrong

- Several of the Pokemon are flat-out wrong

- Number font is wrong

- Bottom isn't square for some reason

Odd results.
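
For anyone checking a generated grid against the prompt, here is a minimal sketch of the expected answer key. It assumes the three style rules depend only on the digit count of the Pokédex number; the names `first_primes` and `style_for` are made up for illustration and are not part of the original experiment.

```python
from collections import Counter


def first_primes(count):
    """Return the first `count` primes using trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        # Only primes up to sqrt(candidate) need to be tested as divisors.
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def style_for(number):
    """Map a Pokédex number to the style the prompt asks for, by digit count."""
    styles = {1: "8-bit", 2: "charcoal drawing", 3: "Ukiyo-e"}
    return styles[len(str(number))]


dex_numbers = first_primes(64)
style_counts = Counter(style_for(n) for n in dex_numbers)

# Print the 8x8 layout, row by row, so it can be compared with an image.
for row in range(8):
    print(" ".join(f"{n:>3}" for n in dex_numbers[row * 8:(row + 1) * 8]))
print(dict(style_counts))
```

The style changes after the 4th and 25th primes (7 to 11, and 97 to 101), and both boundaries fall mid-row in an 8-wide grid, so assigning styles per row cannot match the prompt.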

parasti

A great technical achievement, for sure, but this is kind of the moment where it enters uncanny valley to me. The promo reel on the website makes it feel like humans doing incredible things (background music intentionally evokes that emotion), but it's a slideshow of computer generated images attempting to replicate the amazing things that humans do. It's just crazy to look at those images and have to consciously remind myself - nobody made this, this photographed place and people do not exist, no human participated in this photo, no human traced the lines of this comic, no human designer laid out the text in this image. This is a really clever amalgamation machine of human-based inputs. Uncanny valley.

simonw

I've been trying out the new model like this:

  OPENAI_API_KEY="$(llm keys get openai)" \
    uv run https://tools.simonwillison.net/python/openai_image.py \
    -m gpt-image-2 \
    "Do a where's Waldo style image but it's where is the raccoon holding a ham radio"
Code here: https://github.com/simonw/tools/blob/main/python/openai_imag...

Here's what I got from that prompt. I do not think it included a raccoon holding a ham radio (though the problem with Where's Waldo tests is that I don't have the patience to solve them for sure): https://gist.github.com/simonw/88eecc65698a725d8a9c1c918478a...

neom

Here is my regular "hard prompt" I use for testing image gen models:

"A macro close-up photograph of an old watchmaker's hands carefully replacing a tiny gear inside a vintage pocket watch. The watch mechanism is partially submerged in a shallow dish of clear water, causing visible refraction and light caustics across the brass gears. A single drop of water is falling from a pair of steel tweezers, captured mid-splash on the water's surface. Reflect the watchmaker's face, slightly distorted, in the curved glass of the watch face. Sharp focus throughout, natural window lighting from the left, shot on 100mm macro lens."

google drive with the 2 images: https://drive.google.com/drive/folders/1-QAftXiGMnnkLJ2Je-ZH...

Ran a bunch both on the .com and via the api, none of them are nearly as good as Nano Banana.

(My file share host used to be so good and now it's SO BAD, I've re-hosted with them for now I'll update to google drive link shortly)

swalsh

Been using the model for a few hours now. I'm actually really impressed with it. This is the first time I've found value in an image model for stuff I actually do. I've been using it to build PowerPoint slides and mockups. It's CRAZY good at that.

bsenftner

My problem with all of this is the terrible education everyone has: they cannot discriminate images from art, nor art from communications, and if they could, they would realize that the points this entire debate hinges on are a manipulation to create people who will not help themselves with the latest technologies. But explaining it makes people angry, because they either think I'm trying to manipulate them, or they fall into despair when they realize the magnitude of this crime.

madrox

This seems like a great time to mention C2PA, a specification for positively affirming image sources. OpenAI participates in this, and if I load an image I had AI generate in a C2PA Viewer it shows ChatGPT as the source.

Bad actors can strip sources out so it's a normal image (that's why it's positive affirmation), but eventually we should start flagging images with no source attribution as dangerous the way we flag non-https.

Learn more at https://c2pa.org

skybrian

This time it passed the piano keyboard test:

https://chatgpt.com/s/m_69e7ffafbb048191b96f2c93758e3e40

But it screwed up when attempting to label middle C:

https://chatgpt.com/s/m_69e8008ef62c8191993932efc8979e1e

Edit: it did fix it when asked.

justani

I have a few cases where nano banana fails all the time, and even gpt image 2 is failing.

A 3 * 3 cube made out of small cubes, with a small 2 * 2 cube removed from it - https://chatgpt.com/share/69e85df6-5840-83e8-b0e9-3701e92332...

Create a dot grid containing a rectangle covering 4 dots horizontally and 3 dots vertically - https://chatgpt.com/share/69e85e4b-252c-83e8-b25f-416984cf30...

One where nano banana fails but gpt image 2 worked: create a grid from 1 to 100 and in that grid put a snake, with its head at 75 and tail at 31 - https://chatgpt.com/share/69e85e8b-2a1c-83e8-a857-d4226ba976...

porphyra

The improvement in Chinese text rendering is remarkable and impressive! I still found some typos in the Chinese sample pic about Wuxi though. For example the 笼 in 小笼包 was written incorrectly. And the "极小中文也清晰可读" section contains even more typos although it's still legible. Still, truly amazing progress. Vastly better than any previous image generation model by a large margin.

schneehertz

Generating a 4096x4096 image with gemini-3.1-flash-image-preview consumes 2,520 tokens, which is equivalent to $0.151 per image.

Generating a 3840x2160 image with gpt-image-2 consumes 13,342 tokens, which is equivalent to $0.4 per image.

This model is more than twice as expensive as Gemini.
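
The two quotes are for different output sizes, so a per-pixel comparison is useful as well. A rough sketch using only the figures quoted above (the variable names are mine, and the per-token prices are merely implied by these numbers, not taken from any price list):

```python
# Figures as quoted in the comment above: (width, height, tokens, USD per image).
quotes = {
    "gemini-3.1-flash-image-preview": (4096, 4096, 2_520, 0.151),
    "gpt-image-2": (3840, 2160, 13_342, 0.40),
}


def summarize(width, height, tokens, usd):
    """Derive per-megapixel and implied per-million-token cost for one quote."""
    megapixels = width * height / 1_000_000
    return {
        "megapixels": megapixels,
        "usd_per_megapixel": usd / megapixels,
        "implied_usd_per_million_tokens": usd / tokens * 1_000_000,
    }


stats = {model: summarize(*quote) for model, quote in quotes.items()}
gemini = stats["gemini-3.1-flash-image-preview"]
gpt = stats["gpt-image-2"]

per_image_ratio = quotes["gpt-image-2"][3] / quotes["gemini-3.1-flash-image-preview"][3]
per_megapixel_ratio = gpt["usd_per_megapixel"] / gemini["usd_per_megapixel"]

for model, s in stats.items():
    print(model, {k: round(v, 4) for k, v in s.items()})
print(f"per image: {per_image_ratio:.2f}x, per megapixel: {per_megapixel_ratio:.2f}x")
```

On these numbers gpt-image-2 comes out at about 2.6x the price per image, but roughly 5.4x per megapixel, since its image has about half as many pixels.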

TrackerFF

This is the first model I've used for mockups where I feed reference images, and they truly look real and good enough for pro use. I'm impressed.

Oarch

Every groundbreaking new AI release feels like a volley of cannonfire towards the soul. Oof.

dktp

One interesting thing I found comparing OpenAI and Gemini image editing: Gemini rejects anything involving a well-known person. Anything. OpenAI was happy to edit and change every time I tried.

I have a side project where I want to display standup comedies. I thought I could edit standup comedy posters with some AI to fit my design. Gemini straight up refuses to change any image of any standup comedy poster involving a well-known human. OpenAI does not care and is happy to edit away.

amunozo

This is not as exciting as previous models were, but it is incredibly good. I am starting to think that expressing thoughts in words clearly is probably the most important and general skill of the future.

freakynit

Collection of some amazing prompts and corresponding images: https://gpt2-image-showcase.pagey.site/

Credits: https://github.com/magiccreator-ai/awesome-gpt-image-2-promp...

____tom____

No mention of modifying existing images, which is more important than anything they mentioned.

I think we all know the feeling of getting an image that is ok, but needs a few modifications, and being absolutely unable to get the changes made.

It either keeps coming up with the same image, or gives you a completely new take on the image with fresh problems.

Anyone know if modification of existing images is any better?

Anything better than OpenAI?

AltruisticGapHN

This is insanely good. But wow, prompting to get any one of these images is way more complicated than prompting Claude Code. There is a ton of vocabulary that comes with it relating to the camera, the lighting, the mood etc.

mercacona

Every improvement in image generation seems to reduce the value of the images themselves. When anything can be faked or created in seconds, what is an image really worth? With text or code, you can dig into a meaningful dialogue because their reality is digital too. But images become like the generic people who show up in photo frames.

I guess it's just a completely personal feeling.

sanex

Having the launch website just scrollable generated images is so slick. I love this.

overgard

Pretty mixed feelings on this. From the page at least, the images are very good. I'd find it hard to know that they're AI. Which I think is a problem. If we had a functioning Congress, I wonder if we might end up with legislation that these things need to be watermarked or otherwise made identifiable as AI generated.

I also don't like that these things are trained on specific artists' styles without really crediting those artists (or even getting their consent). I think there's a big difference between an individual artist learning from a style or paying it homage, vs a machine just consuming it so it can create endless art in that style.

super256

I tried using it for creating 2D logos, which many tools suck at (except Midjourney).

Looks like ChatGPT Images 2 is now good at this too!

squidsoup

Are camera manufacturers working on signed images? That seems like the only way our trust in any digital media doesn't collapse entirely.

lossyalgo

Someone remind me again why this is a good idea to be able to create perfect fake images?

bensyverson

I caught the last minute of this—was it just ChatGPT Images 2.0?

rambojohnson

Just tried it and got six fingers and half a thumb on a simple portrait. Mickey Mouse stuff.

thelucent

It seems to still have this gpt image color that you can just feel. The slight sepia and softness.

nickandbro

200+ points in Arena.ai, that's incredible. They are cleaning house with this model.

hahahacorn

One of the images in the blog (https://images.ctfassets.net/kftzwdyauwt9/4d5dizAOajLfAXkGZ7...) is a carbon copy of an image from an article posted Mar 27, 2026 with credits given to an individual: https://www.cornellsun.com/article/2026/03/cornell-accepts-5...

Was this an oversight? Or did their new image generation model generate an image that was essentially a copy of an existing image?

JimsonYang

> you can make your own mangas

No you can’t.

You still have the Studio Ghibli look from the video. The issue with generating manga was the quality of the characters; there’s plenty of software to place your frames.

But I am hopeful. If I put in a single frame, can it carry over that style for the next images? It would be game changing if a chat could have its own art style.

Oras

My test for image models is asking it to create an image showing chess openings. Both this model and Banana pro are so bad at it.

While the image looks nice, the actual details are always wrong, such as showing pawns in wrong locations, missing pawns, etc.

Try it yourself with this prompt: Create a poster to show opening game for Queen's Gambit to teach kids to play chess.
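
To check a generated poster, it helps to have the correct position at hand. A minimal sketch, assuming the standard Queen's Gambit move order 1. d4 d5 2. c4 and using plain Python rather than a chess library (the helper names are invented for this example, and no move legality is checked):

```python
def starting_board():
    """Return the initial position as 8 rows, rank 8 first, '.' for empty."""
    back = "rnbqkbnr"
    return [
        list(back),
        list("p" * 8),
        *[list("." * 8) for _ in range(4)],
        list("P" * 8),
        list(back.upper()),
    ]


def square(name):
    """Convert a square such as 'd4' to (row, column) indexes into the board."""
    return 8 - int(name[1]), ord(name[0]) - ord("a")


def move(board, source, target):
    """Move whatever stands on `source` to `target` (no legality checks)."""
    (r1, c1), (r2, c2) = square(source), square(target)
    board[r2][c2], board[r1][c1] = board[r1][c1], "."


def placement(board):
    """Encode the board as the piece-placement field of a FEN string."""
    ranks = []
    for row in board:
        text, empty = "", 0
        for cell in row:
            if cell == ".":
                empty += 1
            else:
                text += (str(empty) if empty else "") + cell
                empty = 0
        ranks.append(text + (str(empty) if empty else ""))
    return "/".join(ranks)


board = starting_board()
for source, target in [("d2", "d4"), ("d7", "d5"), ("c2", "c4")]:
    move(board, source, target)

# Uppercase letters are White, lowercase are Black; rank 8 is printed first.
print("\n".join(" ".join(row) for row in board))
print(placement(board))
```

White should have pawns on c4 and d4 and Black a pawn on d5, with every other piece still on its starting square.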

c16

That video seems like it was made for the tiktok generation. Slow down.

rambojohnson

Just tried it and got the usual six fingers, and half a thumb. What are they actually iterating on with these models by now…

baalimago

"Benchmarks" aside, do anyone actually use these image models for anything?

mrzhangbo

I'm exhausted. I've developed many products, but most of them were abandoned halfway through.

codebolt

Anyone test it out for generating 2D art for games? Getting nano banana to generate consistent sprite sheets was seemingly impossible last time I tried a few months ago.

RigelKentaurus

If every single image on their blog was generated by Images 2.0 (I've no reason to believe that's not the case), then wow, I'm seriously impressed. The fidelity to text, the photorealism, the ability to show the same character in a variety of situations (e.g. the manga art) -- it's all great!

platinumrad

Why do all of the cartoons still look like that? Genuinely asking.

elAhmo

I am super out of the loop here, what happened with Dall-E?

PDF_Geek

The free tier for ChatGPT feels pretty much nerfed at this point. I’m barely getting 10 prompts in before it drops me down to the basic model. The restrictions are getting ridiculous. Is anyone else seeing this?

modeless

Can it generate transparent PNGs yet?

tezza

I've rushed out my standardised quality check images for gpt-image-2:

https://generative-ai.review/2026/04/rush-openai-gpt-image-2...

I've done a series over all the OpenAI models.

gpt-image-2 has a lot more action, especially in the Apple Cart images.

VA1337

So is it better than nano-banana after all?

jumploops

Looks like analog clocks work well enough now, however it still struggles with left-handed people.

Overall, quite impressed with its continuity and agentic (i.e. research) features.

souravroy78

Cool!

mvkel

I wonder if this confirms version 1 of some kind of "world model."

It has an unprecedented ability to generate the real thing (for example, a working barcode for a real book)

naseemali925

It's amazingly good at creating UI mockups. Been trying this to create UI mockups for ideas.

aledevv

Only vintage-style images?

franze

the tragedy of image generating ai is that it is used to massively create what already exists instead of creating something truly unique - we need ai artists - and yeah, they will not be appreciated

etothet

I would love to see prompt examples that created the images on the announcement page.

vunderba

I decided to run gpt-image-2 on some of the custom comics I’ve come up with over the years to see how well it would do, since some of them are pretty unusual. Overall, I was quite impressed with how faithfully it adhered to the prompts, given that multi-panel stuff has to maintain a sense of continuity.

Was surprised to see it be able to render a decent comic illustrating an unemployed Pac-Man forced to find work as a glorified pie chart in a boardroom of ghosts.

https://mordenstar.com/other/gpt-2-comics

muyuu

I wonder if this will be decent at creating sprite frame animations. So far I've had very poor results and I've had to do the unthinkable and toil it out manually.

james2doyle

In the next round of ChatGPT advertisements, if they don’t use AI generated images, then that means they don’t believe in their own product right?

lifeisstillgood

Pretty much all of the kerfuffle over AI would go away if it were accurately priced.

After 2008 and 2020, vast amounts of money (tens of trillions) have been printed (reasonably) by western governments and not eliminated from the money supply. So there are vast sums swilling about, funding things like using massively computationally intensive work to help me pick a recipe for tonight.

Google and Facebook had online advertising sewn up, but AI is waaay better at answering my queries. So OpenAI wants some of that, but the cost per query must be orders of magnitude larger.

So charge me, or my advertisers the correct amount. Charge me the right amount to design my logo or print an amusing cat photo.

Charge me the right cost for the AI slop on YouTube

Charge the right amount - and watch as people just realise it ain’t worth it 95% of the time.

Great technology - but price matters in an economy.

kanodiaayush

It stands out to me that this page itself is wonderful to go through (the telling of the product through model generated images).

fizlebit

Scrolling through those images it just feels like intellectual theft on a massive scale. The only place I think you're going to get genuinely new ideas is from humans. Whether those humans use AI or not I don't care, but the repetitive slop of AI copying the creative output of humans I don't find that interesting. Call me a curmudgeon. I guess humans also create a lot of derivative slop even without AI assistance. If this leads somehow to nicer looking user interfaces and architecture maybe that is a good thing. There are a lot of ugly websites, buildings and products.

dakiol

> On the flip side, there are hundreds of ways that these tools cause genuine harm, not just to individuals but to entire systems.

Yeah, agree. I think it's the first time I'm asking myself: Ok, so this new cool tech, what is it good for? Like, in terms of art, it's discarded (art is about humans); in terms of assets: sure, but people are getting tired of AI-generated images (and even if we cannot tell if an image is AI-generated, we can know if companies are using AI to generate images in general, so the appeal is decreasing). Ads? C'mon that's depressing.

What else? In general, I think people are starting to realize that things generated without effort are not worth spending time with (e.g., no one is going to read your 30-page draft generated by AI; no one is going to review your 500-file PR generated by AI; no one is going to be impressed by the images you generate with AI; same goes for music and everything). I think we are gonna see a Renaissance of "human-generated" sooner rather than later. I see it already at work (colleagues writing in Slack "I swear the next message is not AI generated" and the like).

jcattle

Can we talk about how jarring the announcement video is?

AI generated voice over, likely AI generated script (You see, this model isn't just generating images, it's thinking!). From what it looks like only the editing has some human touch to it?

It does this Apple style announcement which everyone is doing, but through the use of AI, at least for me, it falls right into the uncanny valley.

agnishom

I don't know how this benefits humanity. In what way was ChatGPT Images 1.0 not already good enough? Perhaps some new knowledge was created in the process?

cyberjunkie

Looks like AI, and I look away from any image generated by an LLM. It's my easy internal filter to weed out everything that isn't art.

Melatonic

Can it generate anything high resolution at increased cost and time? Or is it always restricted?

jwpapi

Why is it all so asian?

XCSme

Oh wow, scrolling through the page on mobile makes me dizzy

StefanBatory

Do you think those working at ChatGPT have ever wondered how they are contributing to dismantling democracy and ensuring nothing is true by now? The ultimate technological postmodernism.

dazhbog

Yay, let's burn the planet computing more slopium..

RyanJohn

Oh my god, it's very nice!

BohdanPetryshyn

Am I the only one for whom videos in OpenAI releases never load? Tried both Chrome and Safari

tomchui157

Img2+ seed dance 2 = image AGI

bitnovus

great obfuscation idea - hidden message on a grain of rice

apparent

I find the video to be very annoying. Am I supposed to freeze frame 4x per second to be able to see whether the images are actually good? I've never before felt stressed watching a launch video.

ibudiallo

And here I was proud of myself, having taught my mom and her friends how to discern real from fakes they get on WhatsApp groups. Another even more powerful tool for scammers. I'm taking a break.

gfody

there's something funny going on with the live stream audio

szmarczak

Wow, the difference between AI and non-AI images collapses. I hate the future where I won't be able to tell the difference.

dahuangf

good job

rzgrozt

now that's a good work since it's openai

bitnovus

No gpt-5.5

mcfry

How hard is it to have a video player with a fucking volume toggle?

dzonga

for video game assets this is massive.

but in general though - will people believe in anything photographic ?

imagine dating apps, photographic evidence.

I'm guessing we're gonna reach a point where - you fuck up things purposely to leave a human mark.

andai

lol at the fake handwritten homework assignment. Know your customer!

davikr

It definitely lost the characteristic slop look.

OutOfHere

ChatGPT image generation is and has been horrific for the simple reason that it rejects too many requests. This hasn't changed with the new model. There are too many legal non-adult requests that are rejected, not only for edits, but also for original image generation. I'd rather pay to use something that actually works.

irishcoffee

This is so stupid. As a free OSS tool it’s amazing. Paying money for this is fucking stupid. How blind are we all to now before this tech?

rqa129

Can it generate Chibi figures to mask the oligarchy's true intentions on Twitter and make them more relatable?

volkk

the guys presenting are probably all like 25x smarter than I am but good god, literally 0 on screen presence or personality.

minimaxir

HN submission for a direct link to the product announcement which for some reason is being penalized by the HN algorithm: https://news.ycombinator.com/item?id=47853000

simonw

Suggest renaming this to "OpenAI Livestream: ChatGPT Images 2.0"

sho_hn

In 5 years and 3 months between DALL-E and Images 2.0 we've managed to progress from exuberant excitement to jaded indifference.

welder

Introducing DeepFakes 2.0 /s

zb3

Image generation? Hmm, would be cool if OpenAI also made a video-generation model someday..

biosubterranean

Oh no.

ai4thepeople

Each day when my AI girlfriend wakes me up and shows me the latest news, I feel: This is it! We are living in a revolution!

Never before in history did humanity have the possibility of seeing a picture of a pack of wolves! The dearth of photographs has finally been addressed!

I told my AI girlfriend that I will save money to have access to this new technology. She suggested a circular scheme where OpenAI will pay me $10,000 per year to have access to this rare resource of 21st-century daguerreotype.

green_wheel

Well artists, you guys had a good run thank you for your service.

manishfp

Goated release tbh. The text work inside the images is nice.

aliljet

I am hopeful that OpenAI will potentially offer clarity on their loss-leading subscription model. I'd prefer to know the real cost of a token from OpenAI as opposed to praying the venture-funded tokens will always be this cheap.

prvc

I hope they will consider releasing DALL-E 2 publicly, now that there has been so much progress since it was unveiled. It had a really nice vibe to it, so worth preserving.

tkgally

I had it produce a two-page manga with Japanese dialogue. Nearly perfect:

https://www.gally.net/temp/20260422-chatgpt-images-2-example...

Danox

Sam Altman, in his meeting with Tim Cook two and a half years ago: give me money, I think it’ll take $150 billion. Tim Cook: well, here’s what we’re going to do, this is what I think it’s worth…

Later Google tried the same thing, Apple we will give you a $1 billion dollar a year refund, what’s changed in two and a half years?