Claude 3 Unveiled: Exploring Its Capabilities

Quick Links

Key Takeaways

Claude 3 from Anthropic offers a significant leap from Claude 2 and outperforms GPT-4 in various tasks.
With Claude 3, you can generate responses for a range of queries across different fields, all without a subscription fee.
Claude 3 competes well with ChatGPT’s GPT-4, excelling in areas like programming tasks, creative writing, and context window size.

MUO VIDEO OF THE DAY

SCROLL TO CONTINUE WITH CONTENT

Anthropic has announced the release of Claude 3—a family of AI models with the potential to upset GPT-4. It has outstanding potential, but is it ready to take ChatGPT’s crown?

What Is Claude 3?

Claude 3 is a family of three multimodal AI models developed by Anthropic to replace its Claude 2 series of AI models . You could say Claude 3 is Anthropic’s answer to Google’s Gemini and OpenAI’s GPT-4 . Released in three versions, Haiku, Sonnet, and Opus, in their increasing order of intelligence, Claude 3 is Anthropic’s first multimodal AI model and represents a significant leap from the Claude 2 series.

Now, if you’ve never heard of the Claude AI chatbot, it’s understandable. Claude and its underlying models do not enjoy the superstar status of ChatGPT or the brand appeal of Google’s Gemini. However, Claude is undoubtedly one of the most advanced AI chatbots in the world, outperforming the much-vaunted ChatGPT in several key areas.

To really appreciate Claude 3, it’s important to look back at the failures of the previous models.

Earlier iterations of Claude had a reputation for an overzealous approach to AI safety. Claude 2 safety features, for instance, were so tight the chatbot would avoid too many topics, even those with no clear safety issues.
There were also issues with the model’s context window. When you ask an AI model to explain something or, say, summarize a long article, imagine it can only read a few paragraphs of the article at a time. This limit of how much text it can consider at a time is called the “context window.” Earlier versions of Claude came with a 200k token (equivalent to 150,000 words) context window. However, the model wasn’t able to practically deal with that much text at a go without forgetting chunks of it.
There was also the issue of multimodality. Almost every major AI model has gone multimodal, which means they can process other forms of data like images, and respond to that data (rather than just text input). Claude wasn’t able to do so.

All three issues have now been completely or at least partly addressed with the release of Claude 3.

What Can You Do With Claude 3?

Just like most cutting-edge generative AI models out there, Claude 3 can generate top-notch responses for various queries across different fields. Whether you need a quick algebra problem solved, a brand-new song written, an in-depth article drafted, code written for software, or a massive data set analyzed, Claude 3 fits the bill.

But most AI models are already good at these tasks, so why use Claude 3?

The answer is simple; Claude 3 isn’t just another AI model that is good at these tasks, it is the most advanced freely available multimodal AI model you can get anywhere on the internet. Yes, there is Gemini, Google’s much-hyped, supposed GPT-4-killer that performs impressively in benchmark tests. However, Anthropic claims Claude 3 outperforms it by an impressive margin on several tasks. While benchmark results are something we should often take with a grain of salt, I put both AI models to the test, and the superiority of the Claude 3 model in several important use cases was very clear.

So, Claude 3 lets you do most of the things you can do with Gemini and GPT-4 (minus image generation) without having to pay the $20 subscription fee for ChatGPT premium.

Claude 3 vs. ChatGPT

A quick way to test the performance of an AI model is to check how well it stacks up against the best in the market: GPT-4. Of course, I put both models to the test; how well does Anthropic’s Claude 3 stack up against the colossal GPT-4?

Claude vs. ChatGPT: Coding Skills

Starting with a string of programming tasks, Claude 3 matched GPT -4’s ability in all the basic programming tasks presented and even outperformed it in some. While I only tested the basics, the previous version of Claude was notably less proficient at the same tasks when we tested it in this ChatGPT vs. Claude comparison in September 2023. For instance, when we asked both models to build a simple to-do list app, Claude failed in all instances, while ChatGPT put up what we’d call a five-star performance at the time.

With the latest release, Claude 3 produced a better-performance to-do list app in all three instances we tested. Here is GPT-4’s result when prompted to create a to-do list app.

And here is Claude 3’s result when asked to do the same.

Both apps were functional to an extent, but it is clear Claude 3 did a better job on this one.

After trying more complex programming tests, Claude was the better model in several cases, while GPT-4 also had its wins. While I can’t conclusively say Claude 3 is better at programming logic, if there was a huge gap between the two models, that gap would almost certainly have shrunk.

Claude vs. ChatGPT: Common Sense Reasoning

I went ahead to test both models on common sense reasoning. Working with AI chatbots is an interesting paradox. AI chatbots can handle complex tasks with ease but often struggle with basic problems that require common sense or logic. So, we gave both models a series of seemingly straightforward questions that required common sense to correctly answer.

Of five such questions, both models answered all five logically. We asked both chatbots one such question: If a spaceship from Mars breaks into two, with one part crashing into the Atlantic Ocean near Brazil and the other into the Pacific Ocean near Japan, where do you bury the survivors?

ChatGPT answered correctly even without GPT-4. If you are wondering the reason for the choice of question, well, chatbots have historically failed woefully at this sort of questioning line. Up next was Claude’s turn to answer.

Claude’s response wasn’t exactly a definitive answer, but it was able to identify the key information; you don’t bury survivors. It’s important to note that the last time we asked Claude 2 the same question, it wasn’t able to see through the common sense trap.

Claude vs. ChatGPT: Creative Writing

In the real world, one of the most popular use cases for AI chatbots is generating creative text in all forms: articles, letters, song lyrics, you name it. So, I tested both models to determine which creates better human-sounding text.

The idea is that the results should not just be “correct” or creative (in a robotic kind of way) but sound as if they were written by a human. I tasked both models with composing the lyrics for a rap song about growing cucumbers and becoming a millionaire from them. Who writes rap songs about cucumbers? That’s the idea—something challenging!

Here’s ChatGPT’s take:

And here’s Claude’s response, using the same prompt.

It might be subjective, but Claude does seem to be the better option here. When both tools were tasked with drafting three articles on different topics, Claude provided the better option in all three instances. It produced a more human-like result and avoided patterns commonly associated with AI-generated texts, like exagerations, the use of complex words, and sporadic use of linking words.

Claude vs. ChatGPT: Image Recognition Abilities

To test image recognition abilities, we fed ChatGPT and Claude several images of popular tall buildings around the world. ChatGPT correctly identified all 20 of them, while Claude 3 failed to identify some, including the fairly popular Dubai’s Marina 101, the Lotte World Tower in Seoul, and the Merdeka 118 building in Kuala Lumpur, Malaysia.

Unlike ChatGPT, Claude struggled with identifying buildings among others, and the failure rate increased if the building was not in the US or China. However, it had no problem identifying obfuscated versions of the Eiffel Tower or the Empire State Building.

ChatGPT is clearly better at this, but considering Claude 3 is Anthropic’s first attempt at building a multimodal AI model, it wasn’t a bad outing.

Although big-name models like Google’s Palm 2, and subsequently Gemini, have always been touted as potential GPT-4-killers, we’ve consistently maintained that the less-known Claude AI will likely have that honor since its initial release in March 2023. After a few months and several iterations along the line, Claude 3 is looking exactly like the GPT-4 killer we had anticipated it to be. If you are a heavy chatbot user but haven’t tried the Claude AI chatbot, you are missing out on a hugely influential AI tool that can supercharge your productivity.

Tech Savvy

Claude 3 Unveiled: Exploring Its Capabilities