Friday, June 5, 2026

Google’s latest AI model report lacks key safety details, experts say


On Thursday, weeks after launching its most powerful AI model yet, Gemini 2.5 Pro, Google published a technical report showing the results of its internal safety evaluations. However, the report is light on the details, experts say, making it difficult to determine which risks the model might pose.

Technical reports provide useful, and at times unflattering, info that companies don’t always widely advertise about their AI. By and large, the AI community sees these reports as good-faith efforts to support independent research and safety evaluations.

Google takes a different safety reporting approach than some of its AI rivals, publishing technical reports only once it considers a model to have graduated from the “experimental” stage. The company also doesn’t include findings from all of its “dangerous capability” evaluations in these write-ups; it reserves those for a separate audit.

Several experts TechCrunch spoke with were nonetheless disappointed by the sparsity of the Gemini 2.5 Pro report, which they noted doesn’t mention Google’s Frontier Safety Framework (FSF). Google introduced the FSF last year in what it described as an effort to identify future AI capabilities that could cause “severe harm.”

“This [report] is very sparse, contains minimal information, and came out weeks after the model was already made available to the public,” Peter Wildeford, co-founder of the Institute for AI Policy and Strategy, told TechCrunch. “It’s impossible to verify if Google is living up to its public commitments and thus impossible to assess the safety and security of their models.”

Thomas Woodside, co-founder of the Secure AI Project, said that while he’s glad Google released a report for Gemini 2.5 Pro, he’s not convinced of the company’s commitment to delivering timely supplemental safety evaluations. Woodside pointed out that the last time Google published the results of dangerous capability tests was in June 2024, for a model announced in February that same year.

Adding to the doubts, Google hasn’t made available a report for Gemini 2.5 Flash, a smaller, more efficient model the company announced last week. A spokesperson told TechCrunch a report for Flash is “coming soon.”

“I hope this is a promise from Google to start publishing more frequent updates,” Woodside told TechCrunch. “Those updates should include the results of evaluations for models that haven’t been publicly deployed yet, since those models could also pose serious risks.”

Google may have been one of the first AI labs to propose standardized reports for models, but it’s not the only one that’s been accused of underdelivering on transparency lately. Meta released a similarly skimpy safety evaluation of its new Llama 4 open models, and OpenAI opted not to publish any report for its GPT-4.1 series.

Hanging over Google’s head are assurances the tech giant made to regulators to maintain a high standard of AI safety testing and reporting. Two years ago, Google told the U.S. government it would publish safety reports for all “significant” public AI models “within scope.” The company followed up that promise with similar commitments to other countries, pledging to “provide public transparency” around AI products.

Kevin Bankston, a senior adviser on AI governance at the Center for Democracy and Technology, called the trend of sporadic and vague reports a “race to the bottom” on AI safety.

“Combined with reports that competing labs like OpenAI have shaved their safety testing time before release from months to days, this meager documentation for Google’s top AI model tells a troubling story of a race to the bottom on AI safety and transparency as companies rush their models to market,” he told TechCrunch.

Google has said in statements that it conducts safety testing and “adversarial red teaming” for models ahead of release, even though this work is not detailed in its technical reports.


