The Sequence Radar #731: Rails, Windows, and Shots — Tinker, DeepSeek V3.2, Sora 2, and Periodic’s $300M | By The Digital Insider
An impressive week of AI releases.
Next Week in The Sequence:
The Sequence Knowledge: We explore the idea of a transformer model for AI interpretability. What????
The Sequence AI of the Week: Gotta go with DeepSeek V3.2's sparse attention mechanism.
The Sequence Opinion: After the release of Periodic, we are going to discuss the potential of transformers for science.
📝 Editorial: Last Week in AI: Rails, Windows, and Shots — Tinker, DeepSeek V3.2, Sora 2, and Periodic’s $300M
I wanted to try a different format for today's editorial. The world of AI is moving so fast that it is nearly impossible to keep up, so I am using this Sunday's editorial to highlight a handful of the key developments that you definitely need to keep an eye on. This week seems like a great place to start.
This week was about raising the ceiling while lowering the friction. Thinking Machines’ Tinker makes post‑training feel like writing experiments, not DevOps; DeepSeek‑V3.2‑Exp turns long‑context efficiency into something you can actually budget for; Sora 2 collapses video and audio generation into a single controllable stack; and Periodic emerged from stealth with a $300 million seed to automate science. The common theme: faster iteration loops, cheaper context, and tighter control surfaces across research, production, and even wet‑lab discovery.
Tinker compresses the path from idea to tuned model. The new API/SDK lets teams keep bespoke ingredients—datasets, losses, schedulers, and RL/SL recipes—while outsourcing brittle distributed training chores (multi‑GPU orchestration, fault tolerance, logging, artifacts). You point Tinker at an open‑weights base (Llama/Qwen/MoE variants), declare the regimen (SFT, DPO/RLAIF, PEFT/LoRA, long‑context packing), and get reproducible runs with minimal infrastructure drag. The strategic bet is clear: a standardized, open‑leaning post‑training rail that shortens the cycle from paper → prototype → policy‑compliant model.
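To make the "experiments, not DevOps" idea concrete, here is a minimal sketch of what declaring a post-training run through such an API could look like. The client, function names, and parameters below are hypothetical illustrations of the workflow, not the actual Tinker SDK.

```python
# Hypothetical sketch of a hosted post-training workflow (NOT the real Tinker SDK).
# The recipe fields, submit() function, and run id are illustrative assumptions only.
from dataclasses import dataclass

@dataclass
class PostTrainRecipe:
    base_model: str          # open-weights base, e.g. a Llama or Qwen checkpoint
    method: str              # "sft", "dpo", or "rlaif"
    adapter: str             # "lora" for parameter-efficient tuning
    dataset: str             # path or registry name for the training data
    learning_rate: float
    max_seq_len: int

def submit(recipe: PostTrainRecipe) -> str:
    """Pretend submission: a real service would handle multi-GPU orchestration,
    fault tolerance, logging, and artifact storage behind this single call."""
    print(f"Submitting {recipe.method.upper()} run on {recipe.base_model} "
          f"with {recipe.adapter} adapters, lr={recipe.learning_rate}")
    return "run-0001"  # hypothetical run id

run_id = submit(PostTrainRecipe(
    base_model="meta-llama/Llama-3.1-8B",
    method="sft",
    adapter="lora",
    dataset="my-team/support-dialogues",
    learning_rate=2e-4,
    max_seq_len=8192,
))
```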
DeepSeek‑V3.2‑Exp normalizes sparse attention for long contexts. Instead of paying full attention cost per token, V3.2 routes interactions through a structured sparsity pattern that trims FLOPs and memory without cratering quality on retrieval‑heavy tasks. Practically, that means more documents in‑frame, higher tokens/sec per GPU, and better economics for agents that reason across large evidence windows. Expect the ripple effects to land in kernel ports, RoPE/relative‑position tweaks, and prompt patterns that assume long context is the default rather than a luxury.
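For intuition on the selection step, here is a toy PyTorch sketch in which each query keeps only its top-k scoring keys and softmaxes over that subset. It is a didactic simplification (it still materializes the full score matrix, which DeepSeek's lightweight indexer is designed to avoid), not the actual DSA kernel.

```python
# Toy top-k sparse attention in PyTorch: a didactic sketch, not DeepSeek's DSA kernel.
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, top_k=64):
    """q, k, v: [batch, seq, dim]. Each query attends only to its top_k keys."""
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5   # [B, S, S]; a real kernel avoids full scores
    k_eff = min(top_k, scores.shape[-1])
    vals, idx = scores.topk(k_eff, dim=-1)                  # keep the k strongest interactions per query
    sparse_weights = F.softmax(vals, dim=-1)                # softmax over the selected keys only
    # Gather the chosen value vectors and mix them with the sparse weights.
    idx_exp = idx.unsqueeze(-1).expand(-1, -1, -1, v.shape[-1])                   # [B, S, k, D]
    v_sel = torch.gather(v.unsqueeze(1).expand(-1, idx.shape[1], -1, -1), 2, idx_exp)
    return (sparse_weights.unsqueeze(-1) * v_sel).sum(dim=2)                      # [B, S, D]

out = topk_sparse_attention(torch.randn(2, 512, 64), torch.randn(2, 512, 64), torch.randn(2, 512, 64))
```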
Sora 2 unifies video realism and synchronized audio. The upgrade tightens physical dynamics (materials, occlusions, cloth/fluids) and improves temporal coherence while adding in‑model dialogue and SFX alignment. Collapsing the traditional two‑model workflow (video + post‑hoc audio) reduces artifacts at scene boundaries and gives product teams a single surface for safety and likeness controls. For pipelines in ads, pre‑viz, simulation, or data augmentation, this simplifies toolchains and shifts more creative control inside one forward pass.
Periodic’s $300M seed pushes AI deeper into experimental science. Founded by alumni across top labs, Periodic is building an AI‑native, closed‑loop stack for discovery: foundation models that propose candidates or protocols, robotic/automated labs that execute, and feedback signals that update models in near‑real‑time. If they can integrate design‑→‑synthesis‑→‑measurement in a stable loop, the cost of exploring huge hypothesis spaces collapses—opening routes to faster materials, chemistry, and bio workflows. For ML teams, this is a signal that “AI for Science” is shifting from offline modeling to online optimization with real‑world actuators.
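A schematic sketch of that design → execute → update loop, with a stand-in proposal model and a simulated "lab" measurement. The function names, scoring, and update rule are hypothetical placeholders, not Periodic's stack.

```python
# Schematic closed-loop discovery sketch (hypothetical placeholders, not Periodic's actual stack).
import random

def propose_candidates(history, n=8):
    """Stand-in for a foundation model proposing experiments, biased toward past winners."""
    best = max(history, key=lambda h: h[1])[0] if history else 0.5
    return [min(1.0, max(0.0, best + random.gauss(0, 0.1))) for _ in range(n)]

def run_experiment(x):
    """Stand-in for a robotic lab measurement: a noisy score of a hidden objective."""
    return -(x - 0.73) ** 2 + random.gauss(0, 0.01)

history = []
for round_ in range(10):                       # each round: propose -> measure -> feed back
    for x in propose_candidates(history):
        history.append((x, run_experiment(x)))
    best_x, best_y = max(history, key=lambda h: h[1])
    print(f"round {round_}: best candidate so far {best_x:.3f} (score {best_y:.4f})")
```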
The pace: weeks feel like product cycles. The cadence is now so fast that capabilities arrive before our playbooks solidify—long context goes from premium to default; post‑training rails go from bespoke to standardized; video+audio control goes from pipeline gymnastics to a single call. Treat roadmaps as living documents: shorten planning horizons, keep eval harnesses always‑on, budget for sudden price/perf step‑changes, and assume rolling deprecations in APIs and prompts. The only stable posture is continuous recalibration—ship smaller, instrument more, and expect next week to redraw the boundary of what’s practical.
🔎 AI Research
LoRA Without Regret, Thinking Machines Lab
AI Lab: Thinking Machines
Summary: The post argues that with the right setup—much higher learning rates, applying LoRA across layers, and modern RL/SFT recipes—LoRA often matches full fine-tuning for common post-training workloads while being cheaper, faster, and preserving base-model skills. It also notes limits: LoRA can underperform on very large supervised datasets (e.g., continued pretraining), but is a strong default for small-to-medium SFT and RL.
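A minimal sketch of the kind of setup the post recommends, using Hugging Face transformers/peft: LoRA adapters on all linear layers rather than only attention projections, paired with a learning rate well above typical full fine-tuning values. The exact hyperparameters below are illustrative, not the post's prescriptions.

```python
# Minimal LoRA setup reflecting the post's advice: adapters across all linear layers
# and a higher-than-full-fine-tuning learning rate. Hyperparameters are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "Qwen/Qwen2.5-7B"                      # any open-weights base
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

lora_cfg = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules="all-linear",   # apply LoRA across layers, including MLPs, not just attention
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()

# In the SFT/RL training config you would then use roughly ~10x the full fine-tuning
# learning rate, e.g. learning_rate=1e-4 instead of 1e-5 (illustrative values).
```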
DeepSeek-V3.2-Exp, DeepSeek-AI
AI Lab: DeepSeek
Summary: Introduces DeepSeek Sparse Attention (with a light “indexer” + top-k token selection) to speed long-context training/inference while largely matching V3.1-Terminus on core benchmarks. Trained via continued pre-training to 128K context and post-training with mixed RL, it shows sizable end-to-end cost reductions at long sequence lengths.
CWM: An Open-Weights LLM for Research on Code Generation with World Models
AI Lab: Meta FAIR
Summary: A 32B dense model mid-trained on Python execution traces and agentic Docker interactions to “learn the world” of code, supporting 131k context and strong coding/math scores (e.g., SWE-bench Verified with test-time scaling). The release includes pretrain/SFT/RL checkpoints and a framework to study how world modeling aids reasoning, planning, and agentic coding.
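For intuition on what "Python execution traces" can look like as a training signal, here is a tiny sketch that records line-by-line local variable state with sys.settrace; the trace format is illustrative, not CWM's actual data pipeline.

```python
# Tiny sketch of collecting Python execution traces (illustrative format, not CWM's pipeline).
import sys

trace = []

def tracer(frame, event, arg):
    if event == "line":
        trace.append((frame.f_lineno, dict(frame.f_locals)))   # record line number + local state
    return tracer

def demo(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

sys.settrace(tracer)
demo(3)
sys.settrace(None)
for lineno, locals_ in trace:
    print(lineno, locals_)
```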
Reinforced Generation of Combinatorial Structures: Applications to Complexity Theory
AI Lab: Google DeepMind & Google
Summary: Uses the AlphaEvolve coding agent to discover finite structures (e.g., near-extremal Ramanujan graphs and MAX-k-CUT gadgets), tightening average-case certification bounds for random regular graphs and improving NP-hardness inapproximability factors for MAX-3-CUT/4-CUT. A key ingredient is evolving much faster (up to 10,000×) verification code to explore larger candidates while maintaining correctness checks.
Mem-α: Learning Memory Construction via Reinforcement Learning
AI Lab: Anuttacon & UC San Diego (with Stanford)
Summary: Proposes Mem-α, an RL framework that trains LLM agents to choose and use memory tools (core, episodic, semantic) with rewards from QA accuracy, tool-call success, compression, and content validity. It outperforms prior memory-augmented agents and generalizes from ~30K-token training to >400K-token sequences.
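A schematic sketch of how such a composite reward could be combined into a single scalar; the weights and component signals are hypothetical stand-ins, not the paper's exact formulation.

```python
# Schematic composite reward for a memory-constructing agent.
# Weights and component scores are hypothetical stand-ins, not Mem-alpha's exact formulation.
def memory_reward(qa_accuracy, tool_call_success_rate, compression_ratio, content_valid):
    """Blend the reward signals described above into a single scalar."""
    return (
        1.0 * qa_accuracy                 # did downstream QA improve?
        + 0.3 * tool_call_success_rate    # were memory tool calls well-formed and successful?
        + 0.2 * compression_ratio         # is the memory compact relative to the raw history?
        + 0.2 * (1.0 if content_valid else 0.0)   # is stored content faithful to the source?
    )

print(memory_reward(qa_accuracy=0.8, tool_call_success_rate=0.95,
                    compression_ratio=0.6, content_valid=True))
```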
Ferret-UI Lite: Lessons from Building Small On-Device GUI Agents
AI Lab: Apple
Summary: Introduces a 3B end-to-end on-device GUI agent trained with a unified action space, synthetic/real GUI data, CoT, zoom-in visual tool-use, and RL with verifiable rewards. It achieves strong grounding (e.g., 91.6% ScreenSpot-V2, 53.3% ScreenSpot-Pro) and competitive navigation (28.0% AndroidWorld; up to 19.8% OSWorld) while highlighting remaining long-horizon limits for small models.
Regression Language Models for Code
AI Lab: Cornell University & Google (DeepMind)
Summary: Presents a unified Regression Language Model (RLM) initialized from T5Gemma that directly regresses code-to-metrics across languages, Triton kernels, and ONNX graphs using digit-by-digit numeric decoding and multi-objective outputs. A 300M-param RLM attains >0.9 Spearman on APPS memory, ~0.5+ average Spearman across 17 CodeNet languages, and state-of-the-art Kendall-τ on several NAS spaces while predicting hardware latencies.
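To illustrate the digit-by-digit numeric decoding idea, here is a toy encoder/decoder that turns a metric value into a sign/exponent/mantissa token sequence and back. This is a simplified sketch of the concept, not the paper's exact tokenization.

```python
# Toy digit-by-digit numeric tokenization: a simplified sketch of the idea,
# not the paper's exact scheme.
import math

def encode_number(x: float, mantissa_digits: int = 4) -> list[str]:
    """Serialize x as sign, exponent, and mantissa-digit tokens the decoder emits one by one."""
    sign = "<+>" if x >= 0 else "<->"
    x = abs(x)
    exp = 0 if x == 0 else int(math.floor(math.log10(x)))
    mantissa = 0.0 if x == 0 else x / (10 ** exp)
    digits = f"{mantissa:.{mantissa_digits - 1}f}".replace(".", "")
    return [sign, f"<E{exp}>"] + [f"<{d}>" for d in digits]

def decode_number(tokens: list[str]) -> float:
    """Invert encode_number: rebuild the float from its token sequence."""
    sign = 1.0 if tokens[0] == "<+>" else -1.0
    exp = int(tokens[1][2:-1])
    digits = "".join(t[1:-1] for t in tokens[2:])
    mantissa = float(digits[0] + "." + digits[1:])
    return sign * mantissa * 10 ** exp

tokens = encode_number(0.004517)      # e.g. a predicted memory or latency metric
print(tokens, decode_number(tokens))
```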
🤖 AI Tech Releases
Sora 2
OpenAI released Sora 2, the newest iteration of its video generation model, with hyper-realistic and more controllable videos.
DeepSeek 3.2
In addition to the research paper, DeepSeek released a new model that uses a sparse attention mechanism to optimize inference costs.
Agent Framework
Microsoft released Agent Framework, a new platform for building and managing multi-agent systems.
GLM 4.6
Zhipu AI released a new version of its marquee GLM model with major improvements in coding, reasoning, and agentic tasks.
Granite 4
IBM released Granite 4, which features a hybrid SSM/transformer architecture.
📡AI Radar
Anthropic brings in ex-Stripe CTO Rahul Patil to tighten the loop between product, infra, and inference as Sam McCandlish shifts to chief architect.
Google pushes Jules deeper into real dev workflows with a new CLI (“Jules Tools”) and early Jules API so teams can wire the agent into CI/CD and terminals. Original: blog.google post.
Perplexity’s Comet browser goes free for everyone, with Max adding a background assistant that runs tasks while you work. Original: Perplexity blog.
Cerebras raises a massive $1.1B Series G at an $8.1B valuation to scale wafer-scale AI compute and cloud inference. Original: Cerebras press release.
Perplexity is absorbing the Visual Electric team to build new “Agent Experiences,” while VE sunsets over 90 days with refunds and export tools. Original: Visual Electric announcement.
Periodic Labs debuts from stealth with $300M to build “AI scientists” and autonomous labs that generate novel experimental data. Original: Periodic’s launch page.
Alex raises $17M to let an AI recruiter run first-round interviews and screens for large employers. Original: Alex company site. (TechCrunch)
“Anything” hits $2M ARR in two weeks and a $100M valuation off an $11M round to turn natural-language prompts into production apps with in-house infra.
Paid (from Outreach founder Manny Medina) lands $21.6M to power results-based billing, margins, and pricing for AI agent companies. Original: paid.ai.
Databricks will ship OpenAI models natively across its platform as part of a $100M push to accelerate enterprise agent adoption (Agent Bricks). Original: Databricks newsroom.
BlackRock’s GIP is in advanced talks to buy Aligned Data Centers for ~$40B, a bet on AI-driven capacity build-outs.
Published on The Digital Insider at https://is.gd/XGps7W.