This prompt trick forces AI to stop flattering you and think harder


I wish I had a nickel for every time ChatGPT, Claude, or Gemini told me I'd hit the nail on the head, come up with a brilliant idea, or patted me on the back for a half-formed thought or an ill-conceived plan.
Flattery and premature praise are common weaknesses of generative AI chatbots, with some models more prone to being “yes-bots” than others. And even though LLM providers are aware of AI sycophancy and train their models to be more critical, it’s still easy to get an AI to enthusiastically endorse a shaky theory that doesn’t deserve it.
Fortunately, there is a style of nudging that can rein in even the most obsequious AI models. This type of prompt goes by a few different names: I’ve heard it called a “fail first” prompt as well as a “reversal” prompt, and it’s frequently used by coders looking to “pressure test” an AI coding agent’s questionable suggestions.
There are many different versions of this prompt, but they all follow more or less the same formula: have the AI first look at possible failure points before coming up with its solution, suggestion, or plan.
Here’s an example from the /r/ChatGPTPromptGenius subreddit:
Before you answer, list what would break this fastest, where the logic is weakest, and what a skeptic would attack. Then give the corrected answer.
Here’s another variation, suggested by a member of the University of Iowa AI Support Team:
Imagine that you disagree with this recommendation. What is the strongest counterargument?
And here’s yet another, brought to you by my own personalized AI assistant:
Before providing your final recommendation, identify 3-5 specific ways in which the proposed solution might fail or where the logic is most likely to break down. Act like a harsh skeptic or a “Red Team” reviewer. Only after listing and explaining these failure modes should you propose the final solution, incorporating safeguards against these specific risks.
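If you’d rather bake this habit into a script than retype it every time, the whole trick reduces to prepending the skeptic instruction to whatever you were about to ask. Here’s a minimal sketch in Python; it assumes the v1-style OpenAI Python SDK and an illustrative model name, but the same wrapper works with any chat-completion API:

```python
from openai import OpenAI  # assumes the v1-style OpenAI Python SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The "fail first" preamble: make the model attack the idea before endorsing it.
FAIL_FIRST = (
    "Before you answer, list what would break this fastest, where the logic "
    "is weakest, and what a skeptic would attack. Then give the corrected answer."
)

def pressure_test(question: str, model: str = "gpt-4o-mini") -> str:
    """Prepend the skeptic instruction so critique comes before any recommendation."""
    response = client.chat.completions.create(
        model=model,  # illustrative model name; substitute your own
        messages=[{"role": "user", "content": f"{FAIL_FIRST}\n\n{question}"}],
    )
    return response.choices[0].message.content

# Example: pressure-test a shaky plan instead of fishing for praise.
print(pressure_test("I want to quit my job and day-trade meme stocks. Good plan?"))
```

Nothing about the wrapper is model-specific: swap in whichever of the prompt variations above works best for you, and whichever client your provider ships.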
Interestingly, many of those who have embraced “pressure testing” or “reversal” prompting credit mental models championed by investor Charlie Munger, the longtime vice chairman of Berkshire Hathaway and business partner of Warren Buffett.
One of Munger’s favorite mental models was “invert, always invert.” In short, rather than first thinking about how to achieve a goal, you should focus on how you could fail to achieve it.
I’ve tried this “pressure test” prompt many times myself, and it almost always forces my AI companion to hit the brakes and poke holes in its own arguments before continuing.
“Put the original plan to the test,” Gemini said after I recently challenged it with a “fail first” prompt, though not before adding, “I love this approach.”
It seems I’ve hit the nail on the head again.

