Anthropic unveils new Claude AI models and ‘computer control’

Graphic from Anthropic alongside the debut of their new Claude 3.5 AI models, including Sonnet and Haiku, and a new feature enabling computer control by artificial intelligence.

Anthropic has announced upgrades to its AI portfolio, including an enhanced Claude 3.5 Sonnet model and the introduction of Claude 3.5 Haiku, alongside a “computer control” feature in public beta.

The upgraded Claude 3.5 Sonnet improves on its predecessor across the board, with the most notable gains in coding. The model scored 49.0% on the SWE-bench Verified benchmark, surpassing all publicly available models, including OpenAI’s offerings and specialist coding systems.

Anthropic has also introduced computer use functionality, which lets Claude interact with a computer much as a person would: viewing the screen, moving the cursor, clicking, and typing. The capability is currently in public beta and makes Claude 3.5 Sonnet the first frontier AI model to offer it.

Several major technology firms have already begun implementing these new capabilities.

“The upgraded Claude 3.5 Sonnet represents a significant leap for AI-powered coding,” reports GitLab, which noted up to 10% stronger reasoning across use cases without additional latency.

The new Claude 3.5 Haiku model, set for release later this month, matches the performance of the previous Claude 3 Opus whilst maintaining cost-effectiveness and speed. It scored 40.6% on SWE-bench Verified, outperforming many competing models, including the original Claude 3.5 Sonnet and GPT-4o.

Model benchmarks comparing new Claude AI models from Anthropic. (Credit: Anthropic)

On computer control, Anthropic has taken a measured approach, acknowledging current limitations whilst highlighting the potential. On the OSWorld benchmark, which evaluates computer interface navigation, Claude 3.5 Sonnet scored 14.9% in screenshot-only tests, nearly double the next-best system’s 7.8%.

The developments have undergone rigorous safety evaluations, with pre-deployment testing conducted in partnership with both the US and UK AI Safety Institutes. Anthropic maintains that the ASL-2 Standard, as detailed in their Responsible Scaling Policy, remains appropriate for these models.

(Image Credit: Anthropic)

Published on The Digital Insider at https://is.gd/FGFF10.

Julio Marchi © Speaks Out Network
