The tech giant took Wall Street by surprise with a standout quarter driven by its AI computing business.
Next Week in The Sequence:
The Sequence Knowledge: Our series about AI interpretability continues with an introduction to sparse autoencoders and some groundbreaking research from OpenAI in that area.
The Sequence Opinion: Discusses the possibilities and challenges of building a transformer for robotics.
The Sequence AI of the Week: Reviews Thinking Machines' debut work on defeating non-determinism in foundation model inference.
Subscribe Now to Not Miss Anything:
📝 Editorial: Oracle’s Quiet AI Decade, Loud Week
Oracle just had the kind of AI week that forces a narrative rewrite. Beyond a historic market reaction, the company highlighted a step‑change in AI demand flowing into its contracted backlog and, per multiple reports, is locking in one of the largest multi‑year compute agreements in the industry—reportedly set to kick in mid‑decade. Whatever you thought Oracle was—a “legacy database vendor”—now looks more like an AI infrastructure company with a data‑centric moat.
Why has Oracle been underestimated next to Microsoft, Google, Amazon, and Meta? Because it avoided the arms race at the model layer and built the unfashionable substrate instead: data, governance, and distribution. The strategy is pragmatic and, in hindsight, obvious—be the neutral fabric that makes other people’s models safe and useful where enterprise data already lives. This shows up in multicloud reality, not slides: Oracle Database services running natively inside other hyperscalers’ datacenters so LLMs and analytics can co‑locate with regulated data without cross‑cloud contortions.
On raw compute, Oracle Cloud Infrastructure (OCI) has been shipping the right primitives for modern AI factories. Supercluster designs pair Blackwell‑class GPU systems (e.g., GB200 NVL72 pods) with high‑bandwidth fabrics, liquid cooling, NVLink for intra‑node communication, and RDMA networking across racks. The result is a platform built for the messy workloads that define the frontier—long‑context training, mixture‑of‑experts sharding, retrieval‑heavy inference, and agentic pipelines that spike bandwidth rather than only FLOPs.
The quiet killer feature is the data plane. Oracle Database 23ai brings vector search into the core engine alongside JSON‑relational duality, graph queries, and GoldenGate replication—so semantic and relational queries run side‑by‑side with the same governance, HA/DR, and recovery you already trust. In practical terms, it collapses today’s brittle pattern—export to a separate vector store and hope your policies follow—into a single transactional system. It’s the difference between a demoable RAG stack and a production‑auditable one.
Distribution is where this advantage compounds. Dedicated Region, Cloud@Customer, Alloy (partner‑operated clouds), and EU Sovereign Cloud let the same AI stack land in bank vaults, hospitals, and ministries—where the data must live—while bursting to GPU superclusters when scale is needed. Combine that with a first‑class multicloud database footprint and enterprises get a realistic path to adopt training, finetuning, and high‑throughput inference without tearing up their compliance posture.
For technical teams, the implications are concrete. Model builders gain another deep pool of cutting‑edge GPUs with modern fabric for massive context and agentic workflows. Data teams can bring LLMs to the data via 23ai rather than spraying sensitive records across third‑party stores. Architects keep true multicloud optionality—databases co‑located where the business runs; models wherever they run best. Oracle has been underestimated precisely because it invested in the unglamorous layers. As AI moves from demos to operations, those layers are where the profit pools—and the production risks—actually live.
🔎 AI Research
Title: Defeating Nondeterminism in LLM Inference
AI Lab: Thinking Machines Lab
Summary: This blog post identifies that the usual suspicion — “floating-point + concurrency” — doesn’t fully explain why large language model inference endpoints yield non-identical outputs even with temperature 0, and points out that the real culprit is batch-dependence (i.e. kernels not being “batch invariant”) leading to varying reduction orders depending on how many simultaneous requests or tokens are being processed. They show how to build batch-invariant kernels (for operations like RMSNorm, matrix multiplication, attention) so that inference becomes truly reproducible, and demonstrate this by implementing them in vLLM: with the new kernels, 1000 identical zero-temperature runs produce bitwise identical completions.
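The core issue is easy to see in miniature: floating-point addition is not associative, so changing the reduction order (as batch-dependent kernels do) can change the result bitwise. A minimal Python sketch, with illustrative values chosen so the rounding error is visible:

```python
# Floating-point addition is not associative: the same four numbers,
# summed in two different orders, give bitwise-different results.
# Values are illustrative, chosen so rounding error is visible.
vals = [1e16, 1.0, -1e16, 1.0]

# Order A: left-to-right reduction (each 1.0 is absorbed into 1e16 and lost).
order_a = ((vals[0] + vals[1]) + vals[2]) + vals[3]

# Order B: pairwise reduction (the large terms cancel first).
order_b = (vals[0] + vals[2]) + (vals[1] + vals[3])

print(order_a, order_b)  # 1.0 2.0 -- same inputs, different sums
```

This is why the fix targets batch invariance: if a kernel's reduction order never depends on how many requests share a batch, zero-temperature decoding becomes bitwise reproducible.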
Title: Language Self-Play for Data-Free Training
AI Lab: Meta Superintelligence Labs, UC Berkeley
Summary: This paper introduces Language Self-Play (LSP), a reinforcement learning method where a single LLM improves by acting as both Challenger (query generator) and Solver (responder), removing reliance on external datasets. Experiments with Llama-3.2-3B show that LSP matches or surpasses data-driven RL baselines, demonstrating the feasibility of perpetual self-improvement without additional training data.
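The loop structure can be sketched in a few lines. This is a toy stand-in, not the paper's implementation: Challenger and Solver would be the same LLM under different prompts, while here they are hypothetical arithmetic helpers, and the RL update is omitted:

```python
import random

# Toy sketch of Language Self-Play: one "model" plays both roles,
# so no external dataset is needed. All helpers here are hypothetical
# stand-ins that make the loop structure visible.

def challenger():
    # Generates a query plus a hidden reference answer for scoring.
    a, b = random.randint(1, 99), random.randint(1, 99)
    return f"{a}+{b}", a + b

def solver(query):
    # The same model answers the query it posed to itself.
    a, b = query.split("+")
    return int(a) + int(b)

def self_play_step():
    query, reference = challenger()
    reward = 1.0 if solver(query) == reference else 0.0
    # In LSP this reward would drive an RL update of the shared model.
    return reward
```

The key property is that both the training signal and the data distribution come from the model itself, which is what lets the loop run without additional training data.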
Title: SimpleQA Verified: A Reliable Factuality Benchmark to Measure Parametric Knowledge
AI Lab: Google DeepMind, Google Research
Summary: This work introduces SimpleQA Verified, a rigorously filtered 1,000-prompt benchmark that addresses noise, redundancy, and biases in OpenAI’s SimpleQA dataset. It provides a more reliable tool to measure LLM factuality, with Gemini 2.5 Pro achieving state-of-the-art performance, outperforming even GPT-5.
Title: WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents
AI Lab: MiniMax, HKUST, University of Waterloo
Summary: The paper presents WebExplorer, a framework that generates challenging web navigation QA pairs using model-based exploration and long-to-short iterative query evolution. Training the 8B-parameter WebExplorer model with this data enables state-of-the-art long-horizon reasoning, outperforming much larger models like WebSailor-72B across benchmarks such as BrowseComp and WebWalkerQA.
Title: Paper2Agent: Reimagining Research Papers as Interactive and Reliable AI Agents
AI Lab: Stanford University (Departments of Genetics, Biomedical Data Science, Biology, Computer Science)
Summary: Paper2Agent is a system that converts research papers into interactive AI agents by building Model Context Protocol (MCP) servers that expose methods, datasets, and workflows as callable tools. Case studies on AlphaGenome, TISSUE, and Scanpy show that these paper-agents can faithfully reproduce original results and support novel analyses, creating a new paradigm for dynamic scientific knowledge dissemination.
Title: An AI System to Help Scientists Write Expert-Level Empirical Software
AI Lab: Google DeepMind, Google Research, Harvard University, MIT, McGill, Caltech
Summary: This paper presents a system that combines large language models with tree search to automatically generate and refine scientific software for scorable tasks—problems where performance can be measured against a quality metric. Across benchmarks in genomics, epidemiology, geospatial analysis, neuroscience, time series forecasting, and numerical integration, the system produced expert-level and often state-of-the-art solutions, including 40 new methods for single-cell analysis and 14 models surpassing the CDC's COVID-19 forecasting ensemble.
🤖 AI Tech Releases
Qwen3-ASR
Alibaba released Qwen3-ASR, a new speech recognition model built on their multi-modal foundation.
ERNIE-4.5-21B-A3B-Thinking
Baidu released ERNIE-4.5-21B-A3B-Thinking, the latest reasoning-focused iteration of its ERNIE model family.
MCP Registry
The MCP team open sourced the first version of the MCP Registry, an open catalog of MCP servers and clients.
Qwen-Next
Another impressive release from Alibaba, Qwen-Next is a highly efficient model featuring training-stability optimizations and a multi-token prediction mechanism for faster inference.
📡 AI Radar
Cognition AI (maker of Devin) raised $400M at a $10.2B valuation, led by Founders Fund.
Perplexity closed a new $200M round at a $20B valuation, just weeks after its last raise.
OpenAI reportedly committed to buy about $300B of Oracle compute over five years starting in 2027—one of the largest cloud deals on record.
Berlin-based Born, maker of the virtual pet Pengu, raised a $15M Series A (Accel, Tencent, Laton) to expand its social AI companion lineup.
Databricks’ head of AI Naveen Rao is leaving to found a new computing startup aimed at novel architectures.
Alibaba shares jumped >7% as investors piled into its aggressive AI push, including new models and cloud expansion plans.
Micro1 raised $35M Series A at a $500M valuation to scale its human-intelligence platform—spanning the “Zara” AI recruiter and a post-training data engine for labs and enterprises.
Microsoft agreed via a nonbinding MOU to support OpenAI converting its for-profit arm into a public benefit corporation, a move that could ease future fundraising.
Anchor co-founders launched Oboe, an AI learning app, building on their earlier $4M seed led by Eniac Ventures.
AegisAI, founded by former Google security leads, raised a $13M seed (Accel and Foundation Capital) to deploy autonomous agents that block email threats pre-inbox.
Mercor is in talks for a Series C at a $10B+ valuation on roughly $450M run-rate revenue, up from a $2B valuation in February (per TechCrunch).
Motion raised a $38M Series C, with a quick follow-on C2 round valuing the company at about $550M post-money, to build an integrated suite of AI agents for SMBs.
Isotopes AI (from Scale AI’s former CTO) emerged with a $20M seed to build an agent that plans and answers questions across enterprise data.
Published on The Digital Insider at https://is.gd/awVFhX.