Wayback Machine’s archiving mysteriously plummets

The Internet Archive’s Wayback Machine is an invaluable resource that does exactly what the nonprofit’s name says: it archives the internet. The Internet Archive archives around 500 million web pages per day.
However, a worrying change has taken place on the platform in recent months. According to a new report from Nieman Lab, the Internet Archive’s Wayback Machine is archiving some websites far less often these days. Even more worrying: many of these websites are news-related.
According to the Nieman Lab report, the Wayback Machine archived 1.2 million snapshots from the home pages of 100 major news sites between January 1 and May 15, 2025. Suddenly, in mid-May, that changed.
The Wayback Machine took just 148,628 snapshots from the home pages of those same 100 news sites between May 17 and October 1, 2025. That represents a whopping 87% drop in the number of snapshots archived between the first four and a half months of the year and the four and a half months that followed.
The CNN homepage, for example, was archived by the Wayback Machine 34,524 times between January 1 and May 15. Since then, only 1,903 snapshots of the home page can be found in the Wayback Machine.
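The percentages behind these figures can be checked with a little arithmetic. The sketch below is illustrative only: the function and variable names are assumptions made for this example, the 1.2 million total is a rounded figure from the report, and the end date of CNN's "since then" window is not stated precisely, so the results are approximate.

```python
from datetime import date


def percent_drop(before: float, after: float) -> float:
    """Return the percentage decrease from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


def days_inclusive(start: date, end: date) -> int:
    """Count the days in a date range, including both endpoints."""
    return (end - start).days + 1


# Snapshot counts as reported by Nieman Lab (1.2 million is rounded).
all_sites_before, all_sites_after = 1_200_000, 148_628
cnn_before, cnn_after = 34_524, 1_903

overall_drop = percent_drop(all_sites_before, all_sites_after)
cnn_drop = percent_drop(cnn_before, cnn_after)

# The two reporting windows are of similar length, so the drop is not
# simply the result of comparing a long period with a short one.
days_first = days_inclusive(date(2025, 1, 1), date(2025, 5, 15))
days_second = days_inclusive(date(2025, 5, 17), date(2025, 10, 1))

print(f"100 news home pages: {overall_drop:.1f}% fewer snapshots")
print(f"CNN home page: {cnn_drop:.1f}% fewer snapshots")
print(f"Window lengths: {days_first} vs. {days_second} days")
```

With the reported numbers, the overall decline works out to a little under 88%, in line with the 87% cited above, while the CNN home page alone fell by roughly 94%.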
Mashable reported in July that, thanks to a new designation from California Sen. Alex Padilla, the Internet Archive would join a network of more than 1,000 libraries across the country responsible for archiving government documents for public consumption.
Mark Graham, the director of the Wayback Machine, told Nieman Lab that “a hiatus in some specific archiving projects in May…resulted in the creation of fewer archives for some sites.” According to Graham, the index structure for some of the missing snapshots simply had not been built yet, and those snapshots would soon be added to the Wayback Machine archives.
As Nieman Lab pointed out, a five-month delay due to index issues is rare. According to Graham, the Internet Archive has experienced delays due to “various operational reasons” such as “resource allocation.” The Internet Archive has not clarified or provided more information to Nieman Lab about the issue.
Newspapers have long been archived for historical purposes. However, in the internet age, most newspapers, except for traditional media giants, have largely faded away in recent years. News media websites have taken their place as historical documents. And since 1996, the Internet Archive has assumed responsibility for storing these archives of web pages.
However, the nonprofit has struggled in recent years. As Nieman Lab reports, the Internet Archive’s expenses for 2023 were $32.7 million. It takes a lot of resources not only to crawl the internet but also to store the data. The nonprofit generated only $23 million in revenue that same year.
Furthermore, the Internet Archive was the victim last October of a huge data breach that took the site, along with the Wayback Machine, offline. It took weeks for the site to be fully restored.



