Google’s Bard Just Beat ChatGPT's GPT-4 in Rankings

Bard's recent update to Gemini Pro propels it past GPT-4 and Claude, marking a significant shift in the chatbot landscape

Ben Wodecki, Jr. Editor

January 31, 2024


At a Glance

  • Google's Bard surpasses OpenAI's GPT-4 in the LMSYS Chatbot Arena Leaderboard with its new Gemini Pro version.

Google Bard just surpassed GPT-4 to become the second-highest-scoring chatbot on the LMSYS Chatbot Arena Leaderboard, loosening the grip of OpenAI’s top models in the chatbot space.

It overtook GPT-4 and is closing in on GPT-4 Turbo, which retains its crown. GPT-4 Turbo and GPT-4 had held a vice-like grip on the first and second spots, respectively, for some time. Bard's surge follows its update to Google's new Gemini Pro large multimodal model.

The Chatbot Arena Leaderboard was created by LMSYS Org, which stands for Large Model Systems Organization, an open research group founded by the University of California, Berkeley in partnership with the University of California, San Diego and Carnegie Mellon University.

LMSYS, which built the Vicuna LLM, described Bard’s leap up the leaderboard as a “remarkable achievement.”

The Chatbot Arena is a benchmark platform for large language models that features "anonymous, randomized battles in a crowdsourced manner." The rankings are based on the Elo rating system, which is widely used in chess and other competitive games.
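To illustrate how an Elo-style rating responds to a single head-to-head result, here is a minimal sketch of the textbook Elo update. The K-factor of 32 and the 400-point scale are common chess conventions used here as assumptions; they are not LMSYS's published settings, and the function names are hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float,
               k: float = 32.0) -> tuple[float, float]:
    """Return new ratings after one battle.

    score_a is 1.0 if A wins, 0.0 if A loses, and 0.5 for a tie.
    k (assumed value) controls how far one result moves the ratings.
    """
    exp_a = expected_score(rating_a, rating_b)
    exp_b = 1.0 - exp_a
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - exp_b)
    return new_a, new_b


if __name__ == "__main__":
    # Two evenly rated models; the first one wins the anonymous battle.
    print(update_elo(1200.0, 1200.0, 1.0))
```

Under these assumptions, a win against an equally rated opponent moves the winner up by half of K and the loser down by the same amount, while a win against a much weaker opponent moves the ratings only slightly.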

The Gemini Pro-powered Bard is only the second model on the board to achieve a score over 1200.

Bard’s rise comes as Google updates the underlying models powering the chatbot. Out goes PaLM 2 and in comes Gemini, Google's most powerful model to date. Google unveiled Gemini in December 2023, launching the initial Pro version for Bard, and expects to release the mammoth version, Gemini Ultra, soon.

Beats Claude, too

Bard also beat all versions of Claude, with the Gemini Pro Dev API version ranking higher than Anthropic’s Claude 2.1 and OpenAI's GPT-3.5 Turbo.

“The race is heating up like never before! Super excited to see what's next for Bard + Gemini Ultra release,” LMSYS said.

The rise up the leaderboard is a welcome reprieve for Google. After a shaky start, Bard has received regular updates, with integrations now spanning other Google apps such as YouTube and Docs.

Recently, Reddit users told Google they wanted Bard to be more like ChatGPT after a Google product manager asked for their wish list. Users requested dedicated mobile apps, custom instructions and image generation, and some of those suggestions are already in the works.

OpenAI’s GPT-4 has routinely topped model leaderboards. It firmly holds first place on Stanford’s HELM Leaderboard, with GPT-4 Turbo in second. PaLM 2, which previously powered Bard, did not do as well, being pipped by Palmyra X V3 from AI startup Writer as the highest-scoring non-OpenAI model on the HELM leaderboard.

About the Author(s)

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.
