A new AI lab with legendary founders and unique ideas.
Next Week in The Sequence:
Our series about RAG continues, exploring the different types of RAG methods. The Sequence Engineering dives into the ultra-popular Eliza agentic framework, which is breaking GitHub records. In our weekly essay, we dive into something we have come to call the DeepSeek Effect; I will let you speculate about the details. We will publish an interview with one of the best early-stage investors in the world. The research edition will dive into Large Action Models.
How is that for a single week?
You can subscribe to The Sequence below:
📝 Editorial: Remember this Name: Ndea
The days of a new AI lab emerging every month are long gone. It has become increasingly clear that building anything even remotely competitive with the industry leaders requires a rare combination of skills and engineering talent. Even heavily funded startups like Character.ai, Inflection, and Adept have struggled, and the market seems overly focused on transformer-based architectures. Meanwhile, even alternative approaches like structured state-space models (SSMs) appear to have reached a plateau. How many AI labs will pursue genuinely original ideas for foundation model architectures? Enter Ndea.
François Chollet, a renowned figure in AI best known for his contributions to Keras, has set his sights on a new frontier with Ndea, an AI research and science lab dedicated to creating artificial general intelligence (AGI) for scientific advancement. Chollet's vision is driven by his belief that the current AI trajectory, dominated by deep learning, has inherent limitations. While acknowledging its successes, he highlights its dependency on massive datasets and its lack of abstract reasoning, arguing that these shortcomings impede true progress.
Ndea seeks to transcend these limitations by pioneering a novel approach: guided program synthesis. Unlike deep learning, which interpolates between data points, program synthesis generates discrete programs that precisely encapsulate the observed data. This method promises superior generalization with significantly greater data efficiency, enabling models to learn from minimal examples. Though program synthesis is still in its infancy—comparable to the early stages of deep learning in 2012—Ndea believes in its transformative potential. The lab is committed to integrating program synthesis with deep learning, creating a synergistic system that combines intuitive pattern recognition with rigorous reasoning.
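To make the contrast with interpolative deep learning concrete, here is a minimal sketch of enumerative program synthesis over input-output examples. The tiny DSL and the `synthesize` helper are illustrative inventions, not Ndea's (unpublished) method; the point is that the search returns a discrete program that fits every example exactly, often from just a handful of them.

```python
from itertools import product

# A tiny DSL of unary integer operations; real systems use far richer grammars.
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}

def synthesize(examples, max_depth=3):
    """Enumerate compositions of primitives until one fits every example."""
    for depth in range(1, max_depth + 1):
        for names in product(PRIMITIVES, repeat=depth):
            def program(x, names=names):
                for name in names:
                    x = PRIMITIVES[name](x)
                return x
            if all(program(i) == o for i, o in examples):
                return names  # a discrete program, e.g. ("inc", "double")
    return None

# Two examples pin down f(x) = 2 * (x + 1): far less data than a
# gradient-based learner would need for the same generalization.
print(synthesize([(1, 4), (3, 8)]))  # -> ('inc', 'double')
```

Guided synthesis, as Ndea describes it, replaces this brute-force enumeration with learned guidance over the search space, which is where deep learning's pattern recognition comes back in.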
This approach sets Ndea apart in the rapidly evolving AI landscape. While some frontier AI labs are beginning to explore program synthesis, they often treat it as a supplementary tool. Ndea, on the other hand, positions program synthesis and deep learning as equally critical pillars for achieving AGI. Their bold vision is embodied in their goal of building a “factory for rapid scientific advancement” capable of driving groundbreaking discoveries across diverse fields. Ndea envisions AGI addressing not only known challenges like autonomous vehicles and drug discovery but also unlocking entirely new scientific frontiers, paving the way for advancements beyond current human comprehension.
To achieve these ambitious goals, Ndea is assembling a world-class team of program synthesis experts. Their emphasis on talent density is unwavering, aiming to foster an environment where innovation and rapid progress thrive. They believe that the success of their mission hinges on the collective brilliance of their team, fostering a collaborative culture where transformative ideas can take root. While acknowledging the risks inherent in such an ambitious endeavor, Ndea remains steadfast in its conviction that AGI holds the key to unparalleled scientific progress and, ultimately, human flourishing.
🔎 AI Research
Transformer²
In the paper "TRANSFORMER2: SELF-ADAPTIVE LLMS," researchers from Sakana AI and the Institute of Science Tokyo introduce Transformer2, a novel self-adaptation framework designed to improve the adaptability and task-specific performance of LLMs. Transformer2 employs a two-pass mechanism: a dispatch system identifies task properties, and task-specific "expert" vectors are dynamically mixed to achieve targeted behavior for incoming prompts12.
Process Reward Models
In the paper"The Lessons of Developing Process Reward Models in Mathematical Reasoning," researchers from the Qwen Team at Alibaba Group investigate Process Reward Models (PRMs) for enhancing the mathematical reasoning abilities of LLMs. They highlight the limitations of Monte Carlo estimation-based data synthesis for PRMs and propose a consensus filtering mechanism that integrates this method with LLM-as-a-judge for improved performance and data efficiency34.
HALOGEN
In the paper"HALOGEN: Fantastic LLM Hallucinations and Where to Find Them," researchers from the University of Washington, Google, and NVIDIA introduce HALOGEN, a benchmark designed to measure and identify hallucinations in LLM-generated text. They evaluate various LLMs across nine domains, finding a high prevalence of hallucinations and proposing a classification schema for different types of hallucination errors56.
PokerBench
In the paper "PokerBench: Training Large Language Models to Become Professional Poker Players," researchers from the University of California, Berkeley and the Georgia Institute of Technology explore the use of LLMs as poker solvers, evaluating their performance on a new benchmark called PokerBench. They find that existing LLMs struggle with optimal poker play but demonstrate significant improvement after fine-tuning.
MiniMax-01
In the paper "MiniMax-01: Scaling Foundation Models with Lightning Attention", researchers from MiniMax introduce the MiniMax-01 series of models, including MiniMax-Text-01 and MiniMax-VL-01. The key contribution is the implementation of lightning attention, a type of linear attention that allows these models to handle much longer context lengths (up to 4 million tokens) while maintaining performance comparable to state-of-the-art models like GPT-4 and Claude.123.
AI in TEEs
In the paper "Trusted Machine Learning Models Unlock Private Inference for Problems Currently Infeasible with Cryptography", researchers from Google explore the use of trusted machine learning models (TCME) within trusted execution environments (TEE) to address privacy concerns in collaborative tasks involving large language models (LLMs). The paper proposes a system where multiple parties can jointly query LLMs without revealing their private data, leveraging the security features of TEEs and the capabilities of TCMEs for tasks like competition analysis.
🤖 AI Tech Releases
AutoGen 0.4
Microsoft released a new version of its popular agentic framework.
Sky-T1
UC Berkeley researchers released Sky-T1-32B, a reasoning model that matches OpenAI's o1-preview performance and was trained for less than $450.
Devin 1.2
Cognition released Devin 1.2, the new iteration of its AI engineering agent.
📡 AI Radar
AI legend François Chollet is launching Ndea, a new startup to build frontier AI systems.
Video AI platform Synthesia raised $180 million in new funding.
AI orchestration startup Nexos.ai emerged from stealth with $8 million in funding.
Bioptimus, a startup building a GPT for biology, raised $76 million.
Axios and OpenAI announced a strategic partnership.
Google and Associated Press signed a partnership to bring up-to-date news to Gemini.
AI accounting startup Open Ledger raised $3 million in funding.
Microsoft launched a new AI engineering division.
Raspberry AI raised $24 million for its AI platform for fashion brands.
AI legal startup Harvey raised a new round at a $3 billion valuation.
Perplexity acquired the team behind Read.cv, a social network for professionals.
Published on The Digital Insider at https://is.gd/arTkJK.