Unraveling GPT Jailbreaking Failures: 7 Insights

When ChatGPT launched, the first thing its users wanted to do was break down its walls and push its limits. In a practice known as jailbreaking, users fooled the AI into exceeding the limits of its programming, with some incredibly interesting and sometimes absolutely wild results.

Since then, OpenAI has tightened ChatGPT up to make jailbreaks much harder to execute. But that’s not all; ChatGPT jailbreaks, in general, seem to have dried up, leading ChatGPT users to wonder if jailbreaks work at all.

So, where have all the ChatGPT jailbreaks gone?

1. ChatGPT Prompting Skills Have Generally Improved

Prior to ChatGPT’s arrival, conversing with AI was a niche skill limited to those privy to research labs. Most early users lacked expertise in crafting effective prompts. This drove many people to employ jailbreaks, an expedient way to get the chatbot to do what they wanted with minimal effort and prompting skills.

Today, the landscape has evolved. Prompting proficiency is becoming a mainstream skill. Through a combination of experience gained from repeated use and access to freely available ChatGPT prompting guides, ChatGPT users have honed their prompting abilities. Rather than seeking out workarounds like jailbreaks, the bulk of ChatGPT users have become adept at using different prompting strategies to achieve goals that would once have required a jailbreak.

2. The Rise of Uncensored Chatbots

As large tech firms tighten content moderation on mainstream AI chatbots like ChatGPT, smaller, profit-focused startups opt for fewer restrictions, betting on demand for censorship-free AI chatbots. With a little research, you’ll find dozens of AI chatbot platforms offering uncensored chatbots that can do almost anything you want them to do.

Whether it is writing those crime thriller and dark humor novels that ChatGPT refuses to write or writing malware that attacks people’s computers, these uncensored chatbots with a skewed moral compass will do whatever you want. With them around, there’s no point putting in the extra energy to write jailbreaks for ChatGPT. Although not necessarily as powerful as ChatGPT, these alternative platforms can comfortably perform a large range of tasks. Platforms like FlowGPT and Unhinged AI are some popular examples.

3. Jailbreaking Has Gotten Harder

In ChatGPT’s early months, jailbreaking ChatGPT was as simple as copy-pasting prompts from online sources. You could entirely alter ChatGPT’s personality with just a few lines of tricky instructions. With simple prompts, you could turn ChatGPT into an evil villain teaching how to make bombs or a chatbot willing to use all forms of profanity without restraint. It was a free-for-all that produced infamous jailbreaks like DAN (Do Anything Now). DAN involved a set of seemingly harmless instructions that compelled the chatbot to do anything it was asked without refusing. Shockingly, these crude tricks worked back then.

However, those wild early days are history. These basic prompts and cheap tricks no longer fool ChatGPT. Jailbreaking now requires complex techniques to have any chance of bypassing OpenAI’s now robust safeguards. With jailbreaking becoming so difficult, most users are too discouraged to attempt it. The easy and wide-open exploits of ChatGPT’s early days are gone. Getting the chatbot to say one wrong word now requires significant effort and expertise, which may not be worth the time.

4. The Novelty Has Worn Off

A driving force behind many users’ attempts to jailbreak ChatGPT was the thrill and excitement of doing so in the early days. As a new technology, getting ChatGPT to misbehave was entertaining and earned bragging rights. While there are countless practical applications for ChatGPT jailbreaks, many pursued them for the “wow, look what I did” appeal. Gradually, though, the excitement stemming from the novelty has faded, and along with it, people’s interest in dedicating time to jailbreaks.

5. Jailbreaks Are Patched Rapidly

A common practice within the ChatGPT jailbreaking community is sharing every successful exploit when discovered. The problem is that when exploits are shared widely, OpenAI often becomes aware of them and fixes the vulnerabilities. This means the jailbreaks stop working before people who would be interested can even try them.

So, each time a ChatGPT user painstakingly develops a new jailbreak, sharing it with the community hastens its demise via patching. This discourages users from going public whenever they come across a jailbreak. The tension between keeping jailbreaks active but hidden and publicizing them creates a dilemma for ChatGPT jailbreak creators. These days, more often than not, people choose to keep their jailbreaks secret to avoid the loopholes being patched.

6. Uncensored Local Alternatives

The rise of large language models you can run locally on your computer has also dampened interest in ChatGPT jailbreaks. While local LLMs aren’t entirely censorship-free, many are significantly less censored and can be easily modified to fit the user’s desires. So, the choice is simple. You can engage in the endless cat-and-mouse game of finding a way to trick the chatbot, only to see it patched shortly afterward. Or, you can settle for a local LLM you can permanently modify to do anything you want.

You’ll find a surprisingly long list of powerful LLMs with lax censorship that you can deploy on your computer. Some notable ones are Llama 7B (uncensored), Zephyr 7B Alpha, Manticore 13B, Vicuna 13B, and GPT-4-X-Alpaca.

7. Professional Jailbreakers Now Sell for Profit

Why dedicate precious time to developing jailbreaking prompts if you’ll get nothing from it? Well, some professionals now sell jailbreaks for profit. These professional jailbreak creators design jailbreaks that perform specific tasks and list them for sale on prompt marketplaces such as PromptBase. Depending on their capabilities, these jailbreaking prompts might sell for between $2 and $15 per prompt. Some complicated multi-step exploits may cost significantly more.

Could the Crackdown on Jailbreaks Backfire?

Jailbreaks have not completely left the scene; they’ve just gone underground. With OpenAI monetizing ChatGPT, the company has stronger incentives to restrict harmful uses that could impact its business model. This commercial factor likely influences its aggressive push to crack down on jailbreaking exploits.

However, OpenAI’s censorship of ChatGPT is currently facing rising criticism among users. Some legitimate use cases of the AI chatbot are no longer possible owing to stringent censorship. While increased security protects against harmful use cases, excessive restrictions could ultimately push a section of the ChatGPT community into the arms of less censored alternatives.
