I Tested AI ‘Humanizers’ to See How Well They Actually Disguise AI Writing

Artificial intelligence (AI) can't do everything (or at least it can't do everything well), but one thing generative AI tools built on large language models are very good at is producing text. If you bombed the verbal section of the SAT and writing anything longer than a text message terrifies you, the whole experience can feel downright magical: being able to generate an email, an essay, or a cover letter without staring at a blank page for hours, agonizing over every word choice, is a powerful thing. That's likely why an estimated nearly 20% of U.S. adults have used AI to write emails or essays.
Once that email or essay is polished (and fact-checked, right?), however, there's a looming obstacle: AI detectors, which range from humans who have learned to spot the telltale signs of AI-generated writing to online tools that claim to scan text and identify whether it was written by a human or an AI. The accuracy of these detectors is dubious; they sometimes flag text that wasn't written by AI at all.
Enter the AI “humanizer,” a tool designed to take your AI-generated copy and make it, well, more human by removing and rephrasing common AI words and phrasings. It's an appealing idea: you have the AI generate your essay, you run it through the humanizer, and the end result reads as if it were written from scratch by a human (presumably, you). But do they work?
The test
To find out, I ran a little experiment. While it's hardly an exhaustive survey, it gave me a solid feel for whether any of these tools is worth using if you insist on having AI secretly write all of your correspondence, school assignments, or heartfelt emails to old friends.
First, I had ChatGPT write an essay about … how to make AI writing sound more human. It churned out an essay in a few seconds, and the result was perfectly coherent. I didn't fact-check it or massage the text in any way; its only purpose was to be run through humanizer tools.
Next, I ran the essay through a few AI detectors to confirm it was a suitably flagrant example of AI writing. The results were as expected: QuillBot scored it 94% AI, ZeroGPT scored it 97%, and Copyleaks gave it a robust 100% AI-generated. The AI detector world agreed: this ChatGPT essay reads like it was written by ChatGPT.
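For what it's worth, if you wanted to reproduce that calibration step without pasting text into each site by hand, the logic amounts to scoring one text against several detectors and comparing the results. Here's a minimal Python sketch; the score_with_* functions are hypothetical placeholders for however you'd query each service (I used their web interfaces, not code):

```python
# Hypothetical sketch: score the same text with several AI detectors.
# None of these functions are real APIs; each one stands in for however
# you get a score (0-100, higher = "more AI") out of a given service.

def score_with_quillbot(text: str) -> float:
    raise NotImplementedError("stand-in for QuillBot's detector")

def score_with_zerogpt(text: str) -> float:
    raise NotImplementedError("stand-in for ZeroGPT's detector")

def score_with_copyleaks(text: str) -> float:
    raise NotImplementedError("stand-in for Copyleaks' detector")

DETECTORS = {
    "QuillBot": score_with_quillbot,
    "ZeroGPT": score_with_zerogpt,
    "Copyleaks": score_with_copyleaks,
}

def calibrate(text: str) -> dict[str, float]:
    """Return each detector's AI score for the same piece of text."""
    return {name: score(text) for name, score in DETECTORS.items()}
```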
The results
Could AI humanizer tools fix this? There are plenty of humanizers out there; the explosion of AI chatbots has sparked an arms race between detectors and the tools designed to fool them. So I picked a few popular ones to test.
First, though, I wanted a bit more calibration, so I did the obvious thing: I pasted the ChatGPT text back into ChatGPT and asked it to humanize the text. All of these tools are built on AI, after all, so maybe the simplest fix in the world is to ask ChatGPT to sound less like itself.
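I did this in the chat window, but the same request works against the API. Here's a minimal sketch using the official OpenAI Python SDK; the model name and prompt wording are my assumptions, not exactly what I typed:

```python
# Minimal sketch: ask the model to "humanize" its own output via the
# OpenAI Python SDK. Model choice and prompt wording are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def humanize_with_chatgpt(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # assumption; any chat model would do here
        messages=[
            {
                "role": "user",
                "content": "Rewrite the following essay so it reads like it "
                           "was written by a human, not an AI:\n\n" + text,
            }
        ],
    )
    return response.choices[0].message.content
```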
Then I took the original ChatGPT-generated text and fed it to four other humanizer tools: Paraphraser.io, StealthWriter, Grammarly, and GPTHuman.
Now I had five “humanized” versions of an essay that three AI detectors had flagged as obviously AI. Would their scores improve? The answer was mostly no, though one tool showed what you could generously call “promise”:
- Paraphraser.io: Got slaughtered. QuillBot flagged its version as 83% AI-generated, Copyleaks gave it a fairly firm 100%, and ZeroGPT a curiously specific 99.94%.
- ChatGPT: Bombed, though to be fair, it isn't specifically a humanizer, and perhaps a more thorough prompt would have produced better results. QuillBot and Copyleaks scored it 100% AI-generated, while ZeroGPT gave it 87.77%.
- Grammarly: Also bombed, pretty thoroughly, with QuillBot, Copyleaks, and ZeroGPT flagging its version at 99%, 97.1%, and 99.97%, respectively.
- GPTHuman: Had mixed results. QuillBot was completely fooled, flagging it at 0% AI-generated, and ZeroGPT wasn't sure of itself, flagging it at just 60.96%. But Copyleaks had no doubts, slapping it with a 100% score.
- StealthWriter: The most effective tool I tested. While ZeroGPT was suspicious, flagging it at a (once again, curiously specific) 64.89% AI-generated, Copyleaks scored it at just 3%, and QuillBot was completely fooled, with a score of 0%.
One aspect of StealthWriter that may have helped its effectiveness is that you can keep running the humanizer over the text again and again. On the first pass, StealthWriter said the text would register as 65% human, so I ran it a second time, and the score jumped into the 80s; I ran it again, and it hit 95%. After that, the score didn't budge no matter how many more times I ran the humanizer over the text.
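That run-it-again approach is really just a loop: keep re-humanizing until the score stops improving. Here's a sketch of the idea in Python, with humanize and human_score as hypothetical stand-ins for StealthWriter's rewriter and its built-in checker (neither is a real API):

```python
# Sketch of the "run it again until the score plateaus" approach.
# humanize() and human_score() are hypothetical stand-ins for
# StealthWriter's rewriter and its built-in checker (0-100, where
# higher means "reads as more human"); neither is a real API call.

def iteratively_humanize(text, humanize, human_score, max_passes=10):
    """Re-run the humanizer until the human-ness score stops improving."""
    best_score = human_score(text)
    for _ in range(max_passes):
        candidate = humanize(text)
        score = human_score(candidate)
        if score <= best_score:  # plateaued, e.g. stuck around 95%
            break
        text, best_score = candidate, score
    return text, best_score
```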
All of these tools make clear that you should review the results and make your own adjustments, and I didn't evaluate the humanized text for writing quality or accuracy. I just wanted to see whether they would fool AI detectors, and the answer is: probably not, though StealthWriter might help.
Finally, consider that there are a lot of AI detection tools out there, which means the variability in scores (even with StealthWriter) is a concern: you can't always know which detection tool someone will use. If they use a detector I didn't try here, and it's better at catching what StealthWriter does, you'll still get nailed. If you're worried about your AI-generated text being detected, your best bet is still to write it yourself, or at least to revise the AI-generated text very thoroughly.