Should we preserve the pre-AI internet before it is contaminated?


Wikipedia is already showing signs of a significant influx of AI-generated content.
Serene Lee / SOPA Images / LightRocket via Getty Images
The arrival of AI chatbots marks a historic dividing line, after which online content can no longer be reliably assumed to have been created by humans. But how will people respond to this change? While some are working urgently to archive pristine data from the pre-AI era, others say it is the outputs of the AIs themselves that we must record, so that future historians can study how chatbots have evolved.
Rajiv Pant, an entrepreneur and former technology chief at both The New York Times and The Wall Street Journal, says he sees AI as a risk to information such as news reports that form part of the historical record. "I've been thinking about this 'digital archaeology' problem since ChatGPT launched, and it becomes more urgent every month," says Pant. "Right now, there's no reliable way to distinguish human-authored content from AI-generated material at scale. This isn't just an academic problem, it affects everything from journalism to legal discovery to scientific research."
For John Graham-Cumming at the cybersecurity company Cloudflare, information produced before the end of 2022, when ChatGPT launched, is akin to low-background steel. This metal, smelted before the Trinity nuclear bomb test on July 16, 1945, is prized for use in sensitive scientific and medical instruments because it lacks the faint radioactive contamination of the atomic-weapons era, which creates noise in readings.
Graham-Cumming has created a website called LowBackGroundsteel.ai to archive data sources that haven't been contaminated by AI, such as a full download of Wikipedia from August 2022. Studies have already shown that Wikipedia today shows signs of a significant AI-generated contribution.
"There was a point when we did everything ourselves, and at some point we started to be significantly augmented by these chat systems," he says. "So the idea was to say – you can see it as contamination, or you can see it as a kind of save point – you know, humans, we got this far. And then after that point, we had extra help."
Mark Graham, who directs the Wayback Machine at the Internet Archive, a project that has been archiving the public internet since 1996, says he is skeptical about the effectiveness of any new archiving effort, given that the Internet Archive already stores up to 160 terabytes of new information every day.
Rather than preserving the pre-AI internet, Graham wants to start creating archives of AI output for future researchers and historians. He has a plan to begin asking chatbots 1,000 news-related questions a day and storing their answers. And because it is such a massive task, he will even use AI to do it: AI recording the shifting output of AI, for the curiosity of future humans.
"You ask it a specific question and you get an answer," explains Graham. "And then tomorrow you ask it the same question, and you will probably get a slightly different answer."
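The daily-snapshot scheme Graham describes could be sketched in a few lines. This is a hypothetical illustration, not the Internet Archive's actual tooling: `ask_chatbot` is a stand-in for whatever model API such an archive would query, and each answer is stored as a dated record with a digest that makes it cheap to spot the days on which the answer changed.

```python
import datetime
import hashlib
import json

def ask_chatbot(question: str) -> str:
    # Placeholder: a real archiver would call a model API here.
    return f"Stub answer to: {question}"

def archive_answer(store: list, question: str, when: datetime.date) -> dict:
    """Ask the chatbot one question and append a dated record to the archive."""
    answer = ask_chatbot(question)
    record = {
        "date": when.isoformat(),
        "question": question,
        "answer": answer,
        # Hashing the answer lets a later pass detect drift without
        # comparing full texts: a new digest means a changed answer.
        "digest": hashlib.sha256(answer.encode()).hexdigest(),
    }
    store.append(record)
    return record

store = []
rec = archive_answer(store, "Who won the 2022 World Cup?",
                     datetime.date(2025, 1, 1))
print(json.dumps(rec, indent=2))
```

Run daily over the same question list, the store becomes a time series of a model's shifting answers, which is the kind of record Graham argues future historians will want.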
Graham-Cumming is quick to stress that he isn't anti-AI, and that preserving human-created information can in fact benefit AI models. Indeed, low-quality AI output that is fed back into the training of new models can have a harmful effect, leading to what is known as "model collapse". Avoiding that outcome is in the AI companies' own interest, he says.
"At some point, one of these AIs will think of something that we haven't thought of."



