How ChatGPT’s New Image Generator Stacks Up Against Gemini’s Nano Banana Pro

Following major image editing upgrades added to Google Gemini in August, under the fancy codename Nano Banana, it’s OpenAI’s turn to supercharge the tools you have for image manipulations in ChatGPT. The new update is called GPT Image 1.5 and is now rolling out to all users.
One of the main improvements here, as was the case with Nano Banana, is how ChatGPT can now edit a specific part of an image while keeping everything else consistent. You can add or remove something, or change the color or style of something, without ending up with a completely different image.
Another feature ChatGPT has now borrowed from Gemini: the ability to combine multiple images together into a single scene. Want you and your best friend in front of the Sydney Harbor Bridge? No problem: just provide the source images and the AI will do the rest. You can also change visual styles while maintaining consistent details.
OpenAI claims that the new image editor and generator is able to follow instructions “more reliably” and render images up to four times faster than before. Text can be more varied in terms of style and size, and images should be more realistic and error-free in general, although OpenAI also admits that there is still room for improvement.
It’s the best image generator tool we’ve ever seen in ChatGPT, and it all looks impressive at first glance, but how does it compare in practice to Gemini and Nano Banana? To find out, I tested both models via the $20 per month plan on each platform (ChatGPT Plus and Google AI Pro, respectively).
Rendering and editing images
Open ChatGPT on web or mobile and you will see that there is a new Images tab in the left navigation pane. This takes you to a library of your existing images, as well as new prompts for creating images. You get suggested prompts, as well as an assortment of premade portrait image styles that you can apply.
A reporter, lamp and countryside scene courtesy of Gemini.
Credit: Gemini
A reporter, lamp and countryside scene courtesy of ChatGPT.
Credit: ChatGPT
I tested the new GPT Image 1.5 model by asking ChatGPT to generate a busy tech reporter, a lamp in the middle of an empty warehouse, and a rolling landscape of hills in fog, cartoon style. I then asked Gemini to create the same images with the same prompts. Although the results were quite varied, in terms of quality and realism they were pretty even – some occasional issues with weird physics and repetition, but nothing too serious.
ChatGPT and Gemini are now also very proficient at precise image editing: both AI bots easily replaced the reporter’s clothes with a shirt and tie without touching any other part of the image. This would have taken a lot of time to do manually, even for a Photoshop expert, and shows how transformative AI imaging is becoming.
The color changes were all handled with aplomb, but the AIs struggled a bit with perspective changes, where I asked to see the same shot from another angle. In these cases, the instructions were followed less closely and the images were less consistent (because new areas had to be rendered), although ChatGPT did a little better than Gemini.
Clothes can now be swapped in seconds (Gemini edition).
Credit: Gemini
Clothes can now be swapped in seconds (ChatGPT edition).
Credit: ChatGPT
The classic “remove an object from this image” challenge was also handled well: Gemini and ChatGPT both managed to remove a countryside cottage with surgical precision, leaving everything else intact. Again, this is the kind of tedious image edit that would previously have required a lot of painstaking effort, and can now be done in seconds.
Gemini’s attempt to remove a cottage.
Credit: Gemini
ChatGPT’s attempt to remove a cottage.
Credit: ChatGPT
Combine and remix images
Another talent that ChatGPT and Gemini now possess is being able to combine images with each other. So you can take separate photos of you and your parents, put them together in the same photo, and then set the background to wherever you want. You can get perfect family photos without having to gather your loved ones or travel.
This is an area where Gemini and ChatGPT struggled a bit more: the editing dexterity was still impressive, but the results didn’t always resemble a single, cohesive scene. The lighting is sometimes off, or elements of different images appear at different scales, and you’ll need to do a bit more adjusting, editing, and re-prompting to get it right.
ChatGPT did a little better at mixing different images and elements and changing the overall appearance of an image. When I tried to get the AIs to blend all of my images into a moody film noir shot, ChatGPT produced something pretty consistent, while Gemini’s effort looked much more like a copy-and-paste job.
It can be fun to remix photos over and over again – adding new people, changing the weather, changing the location – and these two bots are now capable of achieving some pretty incredible results. Remixing photos of family and friends will be popular, but it’s not that simple: With people you know, anything added by generative AI tends to look fake, because neither ChatGPT nor Gemini knows exactly what those people look like, how they smile, how they are built, or how they tend to stand or sit.
Gemini can combine images, but the result still looks like separate images.
Credit: Gemini
ChatGPT did a better job creating a new image that looked correct.
Credit: ChatGPT
ChatGPT and Gemini are both now at a high level, one that puts advanced Photoshop-style editing capabilities within everyone’s reach. If either AI model has the edge right now, it’s ChatGPT, but there’s not much in it. It will be fascinating to see where these image editing capabilities go next.



