© 2026 NewTechZones.com. All Rights Reserved.
Tech News

xAI Launches Grok 3: How Does It Stack Up Against GPT-4o and Claude?

admin
Last updated: May 4, 2026 2:49 pm

Introduction

xAI’s release of Grok 3 generated both excitement and skepticism. The company claims the model beats GPT-4o at coding, and it has integrated Grok 3 directly into the X (formerly Twitter) app for real-time news analysis. We tested Grok 3 on several benchmarks to give you a realistic picture of what this new language model can do.

Contents
  • Introduction
  • Architecture and Training
  • Benchmark Results
  • Pricing and Access
  • Content Policies and Privacy
  • Conclusion

The context of Grok 3’s release matters. xAI began developing AI technology in 2023 with the ambitious goal of building a maximally curious AI with less restrictive guidelines than, in Musk’s telling, its competitors impose. Grok 1 and Grok 2 were credible models but not class-leading ones. Grok 3 seems to mark the point at which xAI has invested enough compute to be truly competitive.

Whether it actually is competitive is debatable: Grok 3 does better in some applications than in others, so the answer depends on which metric you use for the comparison.

Architecture and Training

xAI has been notoriously vague about the model’s architecture, although we can make an educated guess from its benchmark results. Grok 3 seems to be a mixture-of-experts (MoE) model trained on a large dataset of X posts, web crawl data, and code repositories. Its distinguishing feature is the ability to process X posts in real time, something other commercial models do not offer.

MoE architectures activate only a portion of the model’s parameters for each request instead of running the whole model. The model can therefore have a high total parameter count while needing much less compute per request, which makes it practical to serve at massive scale. Google has described Gemini 1.5 as an MoE model, and Mistral’s Mixtral series is MoE as well; Llama 3.1, by contrast, is a dense model. Because less compute is spent per request, MoE models are cheaper to serve, which may account for the competitive pricing.
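
To make the routing idea concrete, here is a minimal sketch of top-k expert selection in plain Python. It is a generic illustration, not xAI’s implementation: the eight toy experts, the gate scores, and the function names `route_top_k` and `moe_forward` are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    top = ranked[:k]
    weights = softmax([gate_scores[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    routed = route_top_k(gate_scores, k)
    output = sum(w * experts[i](x) for i, w in routed)
    return output, [i for i, _ in routed]


# Eight toy "experts" (hypothetical): expert s simply multiplies its input by s.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Hypothetical gate scores; in a real model a learned router produces these per token.
gate_scores = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 0.2]

output, used = moe_forward(3.0, experts, gate_scores, k=2)
active_fraction = len(used) / len(experts)
print(f"experts used: {used}, fraction of experts active: {active_fraction}")
```

With k=2 out of eight experts, only a quarter of the expert parameters do any work for this input, which is where the per-request compute savings come from.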

Grok 3 has a context window of 1 million tokens, putting it in the same league as Gemini Ultra 2. According to xAI, the model uses a novel attention variant called ‘radial attention’, which reduces the quadratic cost of token-by-token attention by organizing the computation in concentric circles that favor nearby tokens in the sequence. Whether radial attention delivers those benefits cannot be independently confirmed.
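
xAI has not published how radial attention works, so the sketch below does not reproduce it. It uses plain sliding-window attention as a stand-in to show why favoring nearby tokens cuts the quadratic cost: it counts how many (query, key) pairs are scored under full attention versus a fixed local window. The function names and sizes are assumptions chosen for illustration.

```python
def full_attention_pairs(n_tokens):
    """Every token attends to every token: n^2 (query, key) pairs."""
    return n_tokens * n_tokens


def windowed_attention_pairs(n_tokens, window):
    """Count (query, key) pairs when each token only attends to keys
    at most `window` positions away on either side."""
    total = 0
    for q in range(n_tokens):
        lo = max(0, q - window)
        hi = min(n_tokens - 1, q + window)
        total += hi - lo + 1
    return total


n_tokens, window = 100_000, 512  # illustrative sizes, not Grok 3's settings
full = full_attention_pairs(n_tokens)
local = windowed_attention_pairs(n_tokens, window)
print(f"full: {full:,} pairs, windowed: {local:,} pairs, ratio: {full / local:.1f}x")
```

The full count grows with the square of the sequence length, while the windowed count grows roughly linearly, which is the kind of saving any locality-biased scheme is after.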

Benchmark Results

Grok 3 did not beat Claude 3.5 Sonnet on any of the benchmarks we ran, and that includes HumanEval. On HumanEval, which assesses coding ability, Grok 3 scored an 85.7% pass rate, while GPT-4o and Claude 3.5 Sonnet scored 90.2% and 92% respectively, so the ‘beats GPT-4o in coding’ claim did not hold up in our testing.
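
For readers unfamiliar with how a HumanEval pass rate is derived, here is a short sketch of the pass@k estimator commonly used with that benchmark. The article does not state how its own figures were computed, so treat this as background only; the per-problem results in the example are made up.

```python
from math import comb


def pass_at_k(n_samples, n_correct, k):
    """Unbiased pass@k estimate for one problem: the probability that at
    least one of k samples drawn from n_samples passes the unit tests."""
    if n_samples - n_correct < k:
        return 1.0
    return 1.0 - comb(n_samples - n_correct, k) / comb(n_samples, k)


def benchmark_pass_rate(results, k=1):
    """Average pass@k over problems; `results` holds (n_samples, n_correct) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical run: four problems, one sample each, three solved.
toy_results = [(1, 1), (1, 0), (1, 1), (1, 1)]
print(f"pass@1 = {benchmark_pass_rate(toy_results):.1%}")
```

With one sample per problem, pass@1 reduces to the plain fraction of problems solved.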

On MATH, a benchmark of competition-style mathematics problems, the model is competitive but behind its rivals: Grok 3 scored 74.3%, Gemini Ultra 2 scored 83.4%, and GPT-4o scored higher still. Where Grok 3 stands out is real-time fact retrieval and social media trend analysis, which is unsurprising given that its training data includes X posts.

The model produced very accurate summaries of discussions about stocks and political events from the previous 24 hours, a test that GPT-4o, Gemini Ultra 2, and Claude failed. Those models gave outdated or partial information about recent events because they lacked direct access to new data and had to fall back on web search. Web search invoked by the model is slower and less reliable than direct real-time access.

On instruction-following tasks, which test whether the model complies with multi-step instructions that include specific formatting requirements, Grok 3 significantly trails GPT-4o and Claude. In practice, that shows up as trouble sticking to specific guidelines in real-world requests, and as occasional editorializing where neutrality would be expected.

Pricing and Access

Grok 3 is available to X Premium+ subscribers, a plan that costs $16/month. The subscription also unlocks the full range of premium X features, which adds value beyond Grok itself, for example by letting you monetize your posts. For users who would buy Premium+ for those features anyway, the marginal price of Grok 3 is effectively zero.

For API developers, Grok 3 costs $6 per million input tokens and $18 per million output tokens. Compared with the standard OpenAI price of $12 and $18 respectively for GPT-4o, that is half the input price and the same output price, so the savings depend on how input-heavy a developer’s workload is.
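
As a rough illustration of what those rates mean for a bill, the sketch below prices a workload using the per-million-token figures quoted in this article. The dictionary keys are just labels for this example, not official API model identifiers, and the token counts are hypothetical.

```python
# Prices in USD per million tokens, taken from the figures quoted in this article.
PRICES_PER_MILLION = {
    "grok-3": {"input": 6.00, "output": 18.00},
    "gpt-4o": {"input": 12.00, "output": 18.00},
}


def request_cost(model, input_tokens, output_tokens):
    """Return the USD cost of a workload for the given model label."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical monthly workload: 2M input tokens and 500k output tokens.
for model in PRICES_PER_MILLION:
    cost = request_cost(model, 2_000_000, 500_000)
    print(f"{model}: ${cost:.2f}")
```

Because only the input price differs, the gap between the two bills grows with the share of input tokens and disappears for output-only workloads.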

One important limitation: because the model’s web interface is integrated with X, users without an X account face a complicated onboarding process. If that describes you, the more practical option is API access through the xAI developer portal. It requires only a credit card, not an X account, so the model is easy for developers to reach.

Content Policies and Privacy

Historically, Grok has been configured with fewer content filters than other major language models such as GPT-4o or Claude. Musk likes to sell this as ‘less censored AI’. In practice, it means Grok 3 will engage with more topics than other models, but it also gives less accurate answers in sensitive areas that call for caution, such as medicine or law. For instance, Grok 3 will make unqualified statements in cases where other models hedge.

Privacy advocates have also pointed out that xAI’s data practices are not very transparent: its data usage policy is less detailed than those of Anthropic or OpenAI, and the question of whether deleted X posts are included in the training data remains unanswered. For enterprise clients that process sensitive information through the model, that ambiguity creates a risk of compliance violations.

The infrastructure, the training data, and the delivery platform (X) all sit within a single company, which raises concerns about concentration of power in AI development. That does not affect the quality of the model itself, but it can matter to enterprises weighing the risks of a long-term deployment.

Conclusion

Overall, Grok 3 is a competent language model with a real edge in accessing live data, an important advantage for anyone who works with real-time information. It is not the leader in coding or on the other benchmarks we ran, but for users already invested in the X ecosystem, it can be an excellent fit.

From a development perspective, xAI has made substantial progress from Grok 1 to Grok 3, backed by a growing investment in compute. The next step, Grok 4, will show whether xAI can match the competition across all benchmarks, including instruction following and reasoning.

Tagged: AI, Elon Musk, Grok, Tech News, xAI
