AI advance helps astronomers spot cosmic events with just a handful of examples


The same transient is shown in three surveys, with rows corresponding to Pan-STARRS (top), MeerLICHT (middle), and ATLAS (bottom). Each row presents, from left to right, the New, Reference and Difference images. Red circles mark the expected position of the transient candidate at the center of each stamp. All stamps are 100 × 100 pixels, but their angular sky coverage differs due to survey-specific pixel scales: Pan-STARRS 0.25″/pixel, MeerLICHT 0.56″/pixel, and ATLAS 1.86″/pixel. Credit: Nature Astronomy (2025). DOI: 10.1038/s41550-025-02670-z
A new study co-led by the University of Oxford and Google Cloud has shown how general-purpose AI can accurately classify real-world changes in the night sky (such as an exploding star, a black hole tearing apart a passing star, a fast-moving asteroid, or a brief stellar flare from a compact star system) and explain its reasoning, without the need for complex training.
Published in Nature Astronomy, the study by researchers at the University of Oxford, Google Cloud and Radboud University demonstrates that a general-purpose large language model (LLM), Google's Gemini, can be transformed into an expert astronomy assistant with minimal guidance.
Using just 15 sample images and a simple set of instructions, Gemini learned to distinguish real cosmic events from imaging artifacts with about 93 percent accuracy. Crucially, the AI also provided a plain-English explanation for each classification, an important step toward more transparent and reliable AI-based science, and toward accessible tools that do not require massive training datasets or deep expertise in AI programming.
“It is striking that a handful of examples and clear instructions can provide such precision,” said co-lead author Dr Fiorenzo Stoppa from the Department of Physics at the University of Oxford. “This allows a wide range of scientists to develop their own classifiers without deep expertise in training neural networks, only the desire to create one.”
“As someone with no formal training in astronomy, this research is incredibly exciting,” said Google Cloud co-lead author Turan Bulmus. “This demonstrates how general-purpose LLMs can democratize scientific discovery, allowing any curious person to contribute meaningfully to areas in which they may not have traditional experience. This is a testament to the power of accessible AI to break down barriers in scientific research.”
Rare signals in a world of noise
Modern telescopes scan the sky tirelessly, generating millions of alerts about potential changes every night. Although some of these are genuine discoveries, such as exploding stars, the vast majority are “false” signals caused by satellite tracks, cosmic ray impacts, or other instrumental artifacts.
Traditionally, astronomers rely on specialized machine learning models to filter this data. However, these systems often operate as a “black box”, providing a simple “real” or “fake” label without explaining their logic. This requires scientists to either blindly trust the results or spend countless hours manually checking thousands of candidates, a task that will become impossible with the next generation of telescopes such as the Vera C. Rubin Observatory, which will produce about 20 terabytes of data every 24 hours.
The research team asked a simple question: Could a general-purpose multimodal AI like Gemini, designed to understand text and images together, not only match the accuracy of specialized models, but also explain what it sees?
The team provided the LLM with just 15 labeled examples for each of the three main sky surveys (ATLAS, MeerLICHT and Pan-STARRS). Each example included a small image of a new alert, a reference image of the same part of the sky, and a “difference” image highlighting the change, along with a brief expert note. Guided only by these few examples and concise instructions, the model then classified thousands of new alerts, providing a label (real/fake), a priority score, and a short, readable description of its decision.
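In outline, the few-shot workflow described above can be sketched in code. This is a minimal illustration, not the authors' actual pipeline: the prompt wording, field names, and example data below are invented, and a real system would send the assembled prompt together with the image triplets to a multimodal model such as Gemini and parse its reply.

```python
import json

# Concise task instructions (wording invented for this sketch).
INSTRUCTIONS = (
    "You are an expert at vetting transient alerts. Given New, Reference and "
    "Difference image stamps, decide whether the candidate is a real "
    "astrophysical source or an artifact. Reply with JSON containing "
    "'label' ('real' or 'fake'), 'priority' (0-10) and 'explanation'."
)

def build_prompt(examples, candidate):
    """Assemble a few-shot prompt: instructions, a handful of labeled
    example triplets with brief expert notes, then the unlabeled candidate."""
    parts = [INSTRUCTIONS]
    for ex in examples:
        parts.append(f"Example ({ex['survey']}): images {ex['images']}")
        parts.append(f"Expert note: {ex['note']} -> label: {ex['label']}")
    parts.append(f"Candidate ({candidate['survey']}): images {candidate['images']}")
    parts.append("Your classification:")
    return "\n".join(parts)

def parse_reply(reply_text):
    """Parse the model's JSON reply into (label, priority, explanation)."""
    reply = json.loads(reply_text)
    return reply["label"], reply["priority"], reply["explanation"]

# Illustrative data: one labeled example and one candidate to classify.
examples = [
    {"survey": "ATLAS", "images": ["new.png", "ref.png", "diff.png"],
     "note": "clean point source in Difference", "label": "real"},
]
candidate = {"survey": "ATLAS", "images": ["n2.png", "r2.png", "d2.png"]}
prompt = build_prompt(examples, candidate)

# A hypothetical model reply, parsed into the three outputs the study describes.
label, priority, why = parse_reply(
    '{"label": "real", "priority": 8, "explanation": "PSF-like residual"}'
)
```

The key design point is that all task knowledge lives in the prompt (instructions plus labeled examples), so adapting the classifier to a new survey means swapping in a different handful of examples rather than retraining a model.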

Gemini uses three images per candidate: New (the latest science frame showing the putative transient), Reference (an earlier or stacked image of the same part of the sky), and Difference (New minus Reference, isolating any transient signal). From each triplet, Gemini returns three results: (1) a real/fake classification (astrophysical source or artifact), (2) a concise textual explanation describing the main features of the image and the reasoning behind the decision, and (3) an interest score indicating follow-up priority for rapid reporting to astronomers. Credit: Nature Astronomy (2025). DOI: 10.1038/s41550-025-02670-z
A human in the loop: an AI that knows when to ask for help
A key part of the study was assessing the quality and usefulness of the AI's explanations. The team assembled a panel of 12 astronomers to review the AI's descriptions, and the panel found them to be highly coherent and useful.
Additionally, in a parallel test, the team asked Gemini to review its own answers and assign a consistency score to each. They found that the model's confidence was a strong indicator of its accuracy: results with low consistency scores were much more likely to be incorrect. This ability to self-assess is essential to creating a reliable, “human in the loop” workflow. By automatically flagging its own uncertain cases for human review, the system can focus astronomers' attention where it is most needed.
Using this self-correction loop to refine initial examples, the team improved the model’s performance on a dataset from ~93.4% to ~96.7%, demonstrating how the system can learn and improve in partnership with human experts.
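The triage logic behind such a human-in-the-loop workflow is simple to sketch. The threshold value and field names below are illustrative, not taken from the paper:

```python
def triage(results, threshold=0.8):
    """Split model outputs by self-reported consistency score:
    high-consistency results are accepted automatically, while
    uncertain ones are routed to a human expert for review."""
    accepted, needs_review = [], []
    for r in results:
        (accepted if r["consistency"] >= threshold else needs_review).append(r)
    return accepted, needs_review

# Illustrative classifications with self-assessed consistency scores.
results = [
    {"id": "cand-001", "label": "real", "consistency": 0.95},
    {"id": "cand-002", "label": "fake", "consistency": 0.40},
]
auto, review = triage(results)
```

Because low-consistency cases are the ones most likely to be wrong, routing only those to humans concentrates expert effort where the model is least reliable, which is what makes the approach tractable at alert volumes of millions per night.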
Co-author Professor Stephen Smartt (Department of Physics, University of Oxford) said: “I have been working on this problem of rapid processing of data from sky surveys for over 10 years, and we are constantly faced with the challenge of separating real events from false signals in the data. We have spent years training machine learning models, neural networks, to do image recognition.
“However, the LLM’s accuracy in recognizing sources with minimal guidance rather than task-specific training was remarkable. If we can scale this, it could be a total game-changer in the field, another example of AI enabling scientific discovery.”
What’s next?
The team sees this technology as the basis for autonomous “assistant agents” in science. Such systems could do much more than classify a single image; they could integrate multiple data sources (like images and brightness measurements), check their own confidence, autonomously request follow-up observations from robotic telescopes, and transmit only the most promising and unusual findings to human scientists.
Because the method requires only a small set of examples and plain language instructions, it can be quickly adapted to new scientific instruments, investigations, and research goals in different fields.
“We are entering an era where scientific discovery is accelerated not by black box algorithms, but by transparent AI partners,” said Turan Bulmus, co-lead author from Google Cloud.
“This work shows a path to systems that learn with us, explain their reasoning, and allow researchers in any field to focus on what matters most: asking the next big question.”
More information:
Textual interpretation of transient image classifications from large language models, Nature Astronomy (2025). DOI: 10.1038/s41550-025-02670-z
Provided by the University of Oxford
Citation: AI advance helps astronomers spot cosmic events with just a handful of examples (October 8, 2025) retrieved October 8, 2025 from https://phys.org/news/2025-10-ai-advance-astronomers-cosmic-events.html
This document is subject to copyright. Except for fair use for private study or research purposes, no part may be reproduced without written permission. The content is provided for informational purposes only.