3 vs. 3: The Open vs. Closed Battle for Big AI | By The Digital Insider

Big model announcements by Meta, Mistral and xAI.

Created Using DALL-E

Next Week in The Sequence:

  • Edge 417: We are getting to the end of our series about autonomous agents with a review of multi-agent systems. We dive into Alibaba Research’s AgentScope and LangChain’s LangGraph framework.

  • Edge 418: We take a second look at the new version of DSPy, which is rapidly becoming one of the most important frameworks for building LLM apps on the market.

You can subscribe to The Sequence below:

TheSequence is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

📝 Editorial: 3 vs. 3: The Open vs. Closed Battle for Big AI

When the open vs. closed weight model debate started a couple of years ago, many thought it was going to be a battle between OpenAI, Anthropic, and Google on one side, and hundreds of open-source models on the other. Reality turned out to be quite different. The open-source space for massively large foundation models has been reduced to three key players: Meta, Mistral, and xAI. This shouldn’t come as a surprise if we consider that training a multi-hundred-billion-parameter model surpasses $100 million in training costs. Open sourcing that kind of investment is something only a few companies can afford.

So GPT-x, Claude, and Gemini versus Llama, Mistral, and Grok.

How will this shape up? When the first versions of these open-weight models came out, they were a couple of iterations behind the quality of the large commercial models. That’s no longer the case, and this week was a good reminder of how competitive the big open-source models can be.

  • Meta open-sourced Llama 3.1, a 405-billion-parameter model that scores at GPT-4o levels on nearly every relevant benchmark. In a recent interview with Bloomberg, Mark Zuckerberg mentioned that Llama 4 should completely close the gap with state-of-the-art AI.

  • Mistral unveiled Mistral Large 2, a 123-billion-parameter model with a 128k-token context window.

  • Elon Musk announced the “most powerful” AI training cluster in the world, consisting of 100,000 GPUs, and said that xAI is training what he claims will be the most powerful AI by any metric.

Big AI is a game for big budgets, and there might still be room for a few more competitors in this race (maybe Ilya Sutskever’s new company). However, the space is not going to change drastically. It’s OpenAI, Google, and Anthropic vs. Meta, Mistral, and xAI. One thing is for certain: open-source big AI is going to be competitive.

🔎 ML Research

AlphaProof and AlphaGeometry 2

Google DeepMind published details about AlphaProof and AlphaGeometry 2, two systems that combined to achieve silver-medalist standard at this year’s International Mathematical Olympiad (IMO). AlphaProof is a reinforcement learning model for mathematical reasoning, while AlphaGeometry 2 uses a neurosymbolic architecture that combines LLMs and symbolic models —> Read more.

The Llama 3 Herd of Models

Meta AI published a paper detailing the architecture and processes behind the Llama 3 family of models. The paper also introduces a compositional approach that integrates image, video, and speech recognition capabilities into Llama 3 —> Read more.

OpenDevin

Researchers from UC Berkeley, Yale, Carnegie Mellon, and other leading AI universities published a paper introducing OpenDevin, a framework for developing AI agents that interact with their environments the way human programmers do. OpenDevin agents can collaborate with human programmers on tasks such as bug fixing, feature building, and testing —> Read more.

Model Collapse

Researchers from Oxford, Cambridge, Imperial College London, and other institutions published a paper in Nature outlining a curious phenomenon in LLMs dubbed model collapse. The thesis of model collapse states that LLMs start showing irreversible degenerative behavior when trained on data created by other AI models —> Read more.
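
The mechanism is easy to see in miniature. Here is a toy sketch (our illustration, not the paper’s code) in which a one-Gaussian “model” is repeatedly refit on its own samples; finite-sample estimation error compounds across generations, and the fitted scale tends to drift toward zero, progressively forgetting the tails of the original distribution:

```python
import numpy as np

# Toy model-collapse loop: fit a simple generative model (a single Gaussian)
# to data, sample a fresh "synthetic" dataset from the fitted model, refit
# on that, and repeat. The estimated scale performs a multiplicative random
# walk with downward drift, so over many generations it tends toward zero.
rng = np.random.default_rng(0)
n = 50
data = rng.normal(loc=0.0, scale=1.0, size=n)  # generation 0: "real" data

for generation in range(1, 201):
    mu, sigma = data.mean(), data.std()    # "train" the model on current data
    data = rng.normal(mu, sigma, size=n)   # next generation sees only model output
    if generation % 20 == 0:
        print(f"gen {generation:3d}: mu={mu:+.3f} sigma={sigma:.3f}")
```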

Visual Haystacks Benchmark

Berkeley AI Research (BAIR) published a paper introducing the Visual Haystacks Benchmark (VHS) for multi-image reasoning. VHS evaluates retrieval and reasoning capabilities across large collections of uncorrelated images —> Read more.

Pruning and Distillation in LLMs

NVIDIA Research published a paper proposing a set of effective compression best practices to build compact LLMs. The techniques combine the best strategies for depth, width, attention, and MLP pruning with knowledge distillation-based retraining —> Read more.
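
As a rough illustration of that recipe (a sketch under our own assumptions, not NVIDIA’s code), the snippet below width-prunes a toy MLP by keeping the hidden units with the largest weight magnitudes, then retrains the pruned “student” with a temperature-softened distillation loss against the original “teacher”:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def width_prune_mlp(mlp: nn.Sequential, keep: int) -> nn.Sequential:
    """Keep the `keep` hidden units of a Linear-ReLU-Linear MLP with the
    largest L2 norm of incoming weights (a simple importance proxy)."""
    fc1, act, fc2 = mlp[0], mlp[1], mlp[2]
    importance = fc1.weight.norm(dim=1)        # one score per hidden unit
    idx = importance.topk(keep).indices
    pruned1 = nn.Linear(fc1.in_features, keep)
    pruned2 = nn.Linear(keep, fc2.out_features)
    with torch.no_grad():
        pruned1.weight.copy_(fc1.weight[idx])  # keep selected rows
        pruned1.bias.copy_(fc1.bias[idx])
        pruned2.weight.copy_(fc2.weight[:, idx])  # and matching columns
        pruned2.bias.copy_(fc2.bias)
    return nn.Sequential(pruned1, act, pruned2)

# Toy teacher/student; sizes and hyperparameters are arbitrary.
teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
student = width_prune_mlp(teacher, keep=64)    # 4x narrower student

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0                                        # distillation temperature
for step in range(200):
    x = torch.randn(16, 32)                    # stand-in for real training data
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    # KL divergence between temperature-softened teacher and student outputs.
    kd_loss = F.kl_div(F.log_softmax(s_logits / T, dim=-1),
                       F.softmax(t_logits / T, dim=-1),
                       reduction="batchmean") * T * T
    opt.zero_grad()
    kd_loss.backward()
    opt.step()
```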

SlowFast-LLaVA

Apple Research published a paper detailing SlowFast-LLaVA (SF-LLaVA), a video language model optimized for capturing the spatial semantics and temporal context in videos. SF-LLaVA uses a two-stream input design to aggregate features from different video frames in ways that facilitate knowledge extraction —> Read more.
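
The two-stream idea is easy to picture in code. The sketch below (our simplification, not Apple’s implementation) builds a token sequence where a Slow pathway keeps a few frames at full spatial resolution while a Fast pathway keeps every frame but pools its features aggressively:

```python
import torch
import torch.nn.functional as F

def two_stream_tokens(frame_feats: torch.Tensor,
                      slow_stride: int = 8,
                      fast_pool: int = 4) -> torch.Tensor:
    """frame_feats: (T, H, W, C) per-frame features from a vision encoder."""
    T, H, W, C = frame_feats.shape
    # Slow pathway: subsample frames, keep the full HxW grid of tokens
    # to preserve spatial semantics.
    slow = frame_feats[::slow_stride]                      # (T/stride, H, W, C)
    slow_tokens = slow.reshape(-1, C)                      # many tokens, few frames
    # Fast pathway: keep all frames, average-pool each frame's grid
    # to capture temporal context cheaply.
    fast = frame_feats.permute(0, 3, 1, 2)                 # (T, C, H, W)
    fast = F.avg_pool2d(fast, kernel_size=fast_pool)       # (T, C, H/p, W/p)
    fast_tokens = fast.permute(0, 2, 3, 1).reshape(-1, C)  # few tokens per frame
    # Both streams are flattened into one sequence for the LLM.
    return torch.cat([slow_tokens, fast_tokens], dim=0)

tokens = two_stream_tokens(torch.randn(32, 8, 8, 256))
print(tokens.shape)  # 4*64 slow + 32*4 fast = 384 tokens of dim 256
```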

🤖 AI Tech Releases

Llama 3.1

Meta open-sourced Llama 3.1, including its 405B-parameter model as well as complementary tools and applications —> Read more.

Mistral Large

Mistral unveiled Mistral Large 2, a 123B-parameter model that rivals Llama 3.1 —> Read more.

SearchGPT

OpenAI unveiled a preview of a new AI-first search engine —> Read more.

NVIDIA AI Foundry

NVIDIA announced the availability of its AI Foundry to enable the creation of custom models for enterprises —> Read more.

Phi-3 Serverless Fine-Tuning

Microsoft unveiled new AI features in the Azure platform including a serverless infrastructure to fine-tune Phi-3 models —> Read more.

Stable Video 4D

Stability AI announced the release of Stable Video 4D, its latest video generation model —> Read more.

🛠 Real World AI

Orchestration at Netflix

Netflix open-sourced Maestro, its engine for orchestrating data and ML pipelines —> Read more.

Product Categorization at Walmart

Walmart Global Tech discussed some of their work behind Ghotok, their predictive generative AI engine used for product categorization —> Read more.

📡 AI Radar



Published on The Digital Insider at https://is.gd/GauKsj.
