OpenAI Group PBC today introduced GPT-5.6, a new series of large language models that it says can outperform Claude Mythos 5 across certain coding tasks.
The most advanced model in the lineup is known as Sol. It’s available alongside a midrange option called Terra and an entry-level model dubbed Luna.
All three artificial intelligence models come with two modes that weren’t included in GPT-5.5. The first is a “max” setting that increases the amount of time GPT-5.6 spends on a task to boost reasoning quality. Additionally, OpenAI has developed an “ultra” mode that can spin up multiple subagents to do work in parallel.
The company describes Sol as the most capable LLM it has built to date. The model scored 88.8% on a popular AI benchmark called TerminalBench-2.1 that includes 89 complex programming tasks. When the company enabled the “ultra” setting, Sol’s score increased to 91.9%. Anthropic PBC’s flagship Claude Mythos 5 model managed 88%.
Claude Mythos 5 was preceded by a model called Mythos Preview that made its debut in April. According to Anthropic, the latter LLM has identified more than 10,000 high-severity and critical software vulnerabilities. OpenAI says that Sol nearly matches Mythos Preview’s performance on a cybersecurity research benchmark called ExploitBench.
The GPT-5.6 series also brings efficiency improvements. OpenAI had Sol tackle GeneBench v1, a collection of scientific data analysis tasks that it released in April. The model matched the performance of the company’s previous flagship LLM using fewer tokens.
Sol includes guardrails designed to prevent it from supporting malicious activities such as developing hacking campaigns. If the controls fail to prevent the LLM from generating harmful output, a specialized large reasoning model filters the response before it reaches the user.
OpenAI says the GPT-5.6 series can not only block risky requests but also fend off cyberattacks. The company ran a series of red-teaming exercises to find universal jailbreaks, hacking tactics that can be used to create not one but multiple malicious prompts.
Some of the tests were carried out automatically using “700,000 A100-equivalent GPU hours.” OpenAI used the test findings to improve its new model lineup’s security.
Terra and Luna, the two lower-end GPT-5.6 models that debuted alongside Sol, trade off some output quality for increased cost-efficiency. Sol is priced at $5 per million input tokens and $30 per million output tokens. Terra costs half as much, while Luna offers 80% lower rates.
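To make the pricing concrete, here is a rough sketch of the per-request arithmetic. It assumes that "half as much" and "80% lower" are both measured against Sol's rates and apply equally to input and output tokens, which the announcement does not spell out. The `Rates` class, the `request_cost` helper and the token counts are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Price in U.S. dollars per million tokens."""
    input_per_million: float
    output_per_million: float


def scaled(base: Rates, factor: float) -> Rates:
    """Return rates equal to `base` multiplied by `factor`."""
    return Rates(base.input_per_million * factor,
                 base.output_per_million * factor)


# Sol's rates are taken from the article.
SOL = Rates(input_per_million=5.00, output_per_million=30.00)
# Assumption: Terra is 50% of Sol's rates, Luna is 20% (80% lower than Sol).
TERRA = scaled(SOL, 0.5)
LUNA = scaled(SOL, 0.2)


def request_cost(rates: Rates, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the given per-million-token rates."""
    return (input_tokens * rates.input_per_million
            + output_tokens * rates.output_per_million) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
    for name, rates in (("Sol", SOL), ("Terra", TERRA), ("Luna", LUNA)):
        print(f"{name}: ${request_cost(rates, 200_000, 50_000):.2f}")
```

Under those assumptions, Terra works out to $2.50 per million input tokens and $15 per million output tokens, and Luna to $1 and $6. If the 80% reduction were instead measured against Terra, Luna's rates would be lower still.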
At the request of the U.S. government, OpenAI is limiting GPT-5.6 access to a “small group of trusted partners” at launch. The company plans to move the LLM series into general availability in a few weeks. Additionally, OpenAI will bring Sol to the WSE-3 wafer-scale AI chip from newly public Cerebras Systems Inc.
Photo: Focal Foto/Flickr