The Sequence Radar #747: Last Week in AI: OpenAI Eyes Wall Street, MiniMax Opens Up, and Vertical AI Goes Deep | By The Digital Insider

Potential IPOs, large rounds, and new models.

Created Using GPT-5

Next Week in The Sequence:

We are starting a new series about synthetic data generation. AI of the week dives into MiniMax-M2, and the opinion section examines some of the challenges and paradoxes of AI evaluations.

Subscribe Now to Not Miss Anything:

TheSequence is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

📝 Editorial: Last Week in AI: OpenAI Eyes Wall Street, MiniMax Opens Up, and Vertical AI Goes Deep

This week in AI captured the sector’s accelerating evolution across capital markets, open-weight innovation, and verticalized products. Five stories shaped the conversation: OpenAI’s potential IPO, MiniMax’s release of M2, Harvey’s new capital raise, Mercor’s valuation surge, and LayerLens’s public platform launch. Together, they highlight an ecosystem where scale, access, and integration are redefining the balance between research and commerce.

OpenAI’s plans for a public listing are gathering momentum. Reports indicate preparations for an IPO that could value the company near a trillion dollars, following a governance restructure that clarified how its nonprofit foundation retains control while investors gain economic rights. Sam Altman framed going public as the most likely route to meet the firm’s vast capital demands for frontier model development. The move signals how model training has evolved into an industrial-scale enterprise—AI as macroeconomics rather than mere technology.

China’s MiniMax released M2, a mixture-of-experts model with 230 billion total parameters (10 billion active per inference) under an MIT license. M2’s design favors reliability and coding performance over experimental architectures, achieving results comparable to leading proprietary systems at a fraction of the cost. The decision to open-weight such a capable model—optimized for agentic and reasoning tasks—marks a new stage in open AI: production-grade accessibility rather than research demos. MiniMax is positioning itself as a viable counterweight to closed labs, emphasizing efficiency, affordability, and developer control.
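The economics above hinge on the mixture-of-experts idea: a model can store many expert subnetworks but route each token through only a few of them, so compute per inference scales with active parameters (10B) rather than total parameters (230B). A minimal, purely illustrative sketch of top-k routing follows; the dimensions, router, and expert shapes here are toy assumptions, not MiniMax's actual architecture.

```python
import math
import random

random.seed(0)

# Toy mixture-of-experts layer: many experts are stored, but only the
# top-k scoring experts run for a given token.
NUM_EXPERTS, TOP_K, D = 8, 2, 4

def rand_matrix(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

gate_w = rand_matrix(D, NUM_EXPERTS)                  # router weights
experts = [rand_matrix(D, D) for _ in range(NUM_EXPERTS)]

def matvec(m, v):
    """Multiply vector v (len rows) through matrix m (rows x cols)."""
    return [sum(m[i][j] * v[i] for i in range(len(v))) for j in range(len(m[0]))]

def moe_forward(x):
    """Route token vector x to its top-k experts and mix their outputs."""
    scores = matvec(gate_w, x)                        # one score per expert
    top = sorted(range(NUM_EXPERTS), key=lambda i: scores[i])[-TOP_K:]
    z = [math.exp(scores[i]) for i in top]
    weights = [w / sum(z) for w in z]                 # softmax over chosen experts
    outs = [matvec(experts[i], x) for i in top]
    return [sum(w * o[j] for w, o in zip(weights, outs)) for j in range(D)]

x = [random.gauss(0, 1) for _ in range(D)]
y = moe_forward(x)

total_params = NUM_EXPERTS * D * D                    # parameters stored
active_params = TOP_K * D * D                         # parameters used per token
```

Only `active_params` worth of expert weights touch each token, which is why a 230B-parameter model can serve at roughly the cost of a 10B dense one.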

Legal AI platform Harvey raised $150 million at an $8 billion valuation, underscoring how defensibility in AI now lies in deep domain integration. Harvey’s success is built on embedding AI throughout the legal workflow—drafting, review, and compliance—rather than relying solely on foundation model differentiation. The vertical model shows that workflow intimacy and regulatory trust are becoming the new moats in enterprise AI.

Meanwhile, Mercor quintupled its valuation to $10 billion in a $350 million Series C. The company’s network-driven approach to AI engineering labor—matching vetted developers to autonomous and enterprise AI projects—has resonated with investors betting on the “AI economy’s labor layer.” Mercor’s growth illustrates the emergence of meta-infrastructure: not models or GPUs, but the human and agentic coordination that builds on top of them.

In a more personal development, LayerLens—a company I co-founded—released a free, public version of its AI benchmarking and evaluation platform. The release includes hundreds of models and benchmarks, allowing users to compare systems side by side, access prompt-by-prompt evaluation results, and explore detailed analytics of model behavior. By making high-quality evaluation tooling open and transparent, LayerLens aims to push the industry toward empirical rigor and reproducibility—the foundation of trustworthy AI progress.

These developments share a common thread: the center of gravity in AI is shifting from research breakthroughs to system-level integration—where compute, capital, and coordination converge into enduring advantage.

🔎 AI Research

gpt-oss-safeguard

AI Lab: OpenAI

Summary: OpenAI announces gpt-oss-safeguard, a pair of open-weight safety reasoning models that interpret developer-provided policies at inference time—classifying messages, completions, and full chats with reviewable chain-of-thought to explain decisions. Released in research preview in 120B and 20B variants (Apache-2.0), they’re fine-tuned from GPT-OSS and evaluated in a companion technical report on baseline safety performance. (openai.com)

Tongyi DeepResearch Technical Report

AI Lab: Tongyi Lab, Alibaba Group

Summary: Tongyi DeepResearch describes a 30B-parameter open-source agentic model combining mid-training and post-training with fully synthetic data and reinforcement learning for deep information-seeking tasks. It achieves state-of-the-art performance on benchmarks like Humanity’s Last Exam, BrowseComp, and WebWalkerQA, establishing a scalable paradigm for autonomous research agents.

Emergent Introspective Awareness in Large Language Models

AI Lab: Anthropic

Summary: The paper tests whether LLMs can genuinely report on their internal states by injecting concept representations into activations (“concept injection”) and observing whether models notice, identify, and distinguish those internal “thoughts” from ordinary inputs; stronger models like Claude Opus 4/4.1 often succeed, including detecting artificial prefills as unintended and exerting limited intentional control over internal states. Overall, it argues that today’s models show functional but unreliable and context-dependent introspective awareness that may grow with capability and post-training strategy.

Beyond Reasoning Gains: Mitigating General Capabilities Forgetting in Large Reasoning Models

AI Lab: Meta Superintelligence Labs

Summary: This paper identifies that reinforcement learning with verifiable rewards (RLVR), while improving reasoning skills, causes models to forget general capabilities such as perception and factual grounding. It introduces RECAP, a replay-based continual learning strategy that dynamically reweights objectives to preserve broad competencies without sacrificing reasoning performance.
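The core idea in the summary above — replay general-capability data and dynamically reweight objectives — can be sketched in a few lines. This is a toy illustration in the spirit of RECAP, not the paper's actual algorithm: the weighting rule and constants here are assumptions.

```python
# Toy replay-based reweighting: mix a reasoning objective with a replayed
# general-capability objective, shifting weight toward the replay term when
# general performance drifts below its baseline.
def reweight(reason_loss, general_loss, baseline_general):
    drift = max(0.0, general_loss - baseline_general)  # how far general skill slipped
    w_general = min(0.9, 0.3 + drift)                  # upweight replay as drift grows
    w_reason = 1.0 - w_general
    return w_reason * reason_loss + w_general * general_loss, w_general

# If general loss has risen from 1.0 to 1.4, the replay term gets more weight.
total, w = reweight(reason_loss=0.8, general_loss=1.4, baseline_general=1.0)
```

When general capabilities hold steady, the schedule leaves most of the weight on the reasoning objective; as forgetting appears, replayed data automatically regains influence.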

AgentFold: Long-Horizon Web Agents with Proactive Context Management

AI Lab: Tongyi Lab, Alibaba Group

Summary: AgentFold proposes a proactive “context folding” mechanism enabling LLM-based web agents to compress and restructure memory dynamically during long-horizon tasks. Trained via supervised fine-tuning on Qwen3-30B, it outperforms larger models by maintaining concise, focused reasoning histories that prevent context saturation and information loss.
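The "context folding" mechanism can be pictured as summarizing older interaction steps once the history exceeds a budget, keeping recent steps verbatim. The sketch below is a toy assumption about how such folding might behave, inferred from the summary rather than taken from the paper.

```python
# Toy context folding: once the agent's history exceeds a step budget,
# collapse the older steps into a single summary entry and keep the
# most recent steps intact.
def fold(history, budget=4):
    if len(history) <= budget:
        return history
    older, recent = history[:-(budget - 1)], history[-(budget - 1):]
    summary = f"[folded {len(older)} steps, starting with: {older[0]}]"
    return [summary] + recent

history = [f"step {i}: observation" for i in range(10)]
folded = fold(history)   # 10 steps compressed to 1 summary + 3 recent steps
```

The payoff is that context length stays bounded no matter how long the task runs, at the cost of lossy compression of older steps.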

Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-Tuning of LLM Agents

AI Lab: Carnegie Mellon University, The Ohio State University & University of Hong Kong collaboration

Summary: This work introduces the Agent Data Protocol (ADP), a standardized schema that unifies heterogeneous agent datasets across domains such as coding, browsing, and API use. By converting 13 datasets into a shared representation and fine-tuning multiple frameworks, ADP achieves up to 20% performance gains and enables reproducible, scalable agent training.
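The value of a shared schema is that heterogeneous records (coding problems, browsing logs, API calls) all become the same trajectory type. The sketch below uses hypothetical field names to illustrate the pattern; it does not reproduce ADP's actual schema.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical unified step format in the spirit of the Agent Data Protocol.
@dataclass
class Step:
    role: str        # "user", "agent", or "tool"
    action: str      # e.g. "message", "code", "api_call"
    content: str

def from_coding_record(rec: dict) -> List[Step]:
    """Convert a made-up coding-dataset record into unified steps."""
    return [Step("user", "message", rec["problem"]),
            Step("agent", "code", rec["solution"])]

def from_browsing_record(rec: dict) -> List[Step]:
    """Convert a made-up browsing-dataset record into unified steps."""
    steps = [Step("user", "message", rec["task"])]
    steps += [Step("agent", "api_call", url) for url in rec["visits"]]
    return steps

# Two very different source datasets land in one training-ready format.
traj = from_coding_record({"problem": "reverse a list", "solution": "xs[::-1]"})
traj += from_browsing_record({"task": "find docs", "visits": ["https://example.com"]})
```

Once everything is a list of `Step`s, a single fine-tuning pipeline can consume all the converted datasets.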

🤖 AI Tech Releases

MiniMax-M2

MiniMax released its M2 model optimized for agentic workflows.

Granite 4.0

IBM released a new generation of its Granite models based on its hybrid Mamba/Transformer architecture.

Cursor 2.0

Anysphere released a new version of Cursor along with its first proprietary model.

LayerLens Atlas

LayerLens (I am a co-founder) released a public version of its AI benchmarking platform with hundreds of models and evals.

LFM2-ColBERT

Liquid AI open sourced its LFM2-ColBERT-350M model optimized for multilingual retrieval.

📡 AI Radar



Published on The Digital Insider at https://is.gd/MOpUD9.
