OpenAI officially releases GPT-5.4, touted as the strongest model

GPT-5.4 ends the fragmentation of capabilities across earlier models. For the first time, OpenAI has integrated core capabilities such as reasoning, coding, computer use, and deep web search into a single "unified" professional-grade model, without sacrificing performance in any individual area as a result of the integration. This is the basis of the "stronger, faster, and more efficient" goal, and of OpenAI's confidence in calling it the "strongest model".

The release introduces two versions for different needs. GPT-5.4 Thinking, for ChatGPT and API users, focuses on assisting with everyday professional work. GPT-5.4 Pro targets complex tasks in advanced professional scenarios. Both launched on the day of release, fully replacing the previous GPT-5.2 Thinking. GPT-5.2 Thinking will be retained as a "legacy model" and officially retired on June 5, 2026, to give users a smooth transition.

Compared with the previous generation, GPT-5.4 improves across its core capabilities, covering professional knowledge work, computer operation, and coding and debugging. Each upgrade targets the practical needs of professional users and is aimed at improving work efficiency.

GPT-5.4 performs particularly well in professional knowledge work. In the GDPval benchmark, which covers 44 occupations, the model matched or exceeded the level of industry professionals in 83.0% of cases, well ahead of the 70.9% achieved by GPT-5.2. In an investment-banking-level spreadsheet modeling test, its average score reached 87.3%, compared with 68.4% for the previous generation. Its generated presentations were also better received: 68.0% of reviewers preferred its output, mainly because of stronger aesthetic design and visual presentation. Accuracy has improved as well. Individual claims are 33% less likely to be wrong and full responses are 18% less likely to contain errors, making it OpenAI's most "rigorous" model to date.

Native computer use is a major highlight of this upgrade. GPT-5.4 is OpenAI's first general-purpose model with native computer-use capabilities: it can operate a computer through screenshots and keyboard and mouse commands to complete complex workflows across applications. In the OSWorld-Verified desktop benchmark, its success rate reached 75.0%, well ahead of the previous generation's 47.3% and above the human average of 72.4%. In web operation testing, it achieved a 92.8% success rate using screenshots alone, which makes the model considerably more useful in practice.
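The screenshot-in, command-out cycle described above can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen for illustration: the `Action` schema, the `FakeDesktop` stand-in, and the `scripted_policy` function that takes the place of the model. It does not reflect OpenAI's actual API; it only shows the general observe, decide, act loop.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """One keyboard or mouse command (hypothetical schema, not OpenAI's)."""
    kind: str                      # "click", "type", or "done"
    x: Optional[int] = None
    y: Optional[int] = None
    text: Optional[str] = None


class FakeDesktop:
    """Stand-in for a real screen; its 'screenshot' is just a text summary."""

    def __init__(self) -> None:
        self.focused = False
        self.typed = ""

    def screenshot(self) -> str:
        return f"focused={self.focused} typed={self.typed!r}"

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.focused = True            # clicking focuses the input box
        elif action.kind == "type" and self.focused:
            self.typed += action.text or ""


def run_agent(desktop: FakeDesktop,
              choose_action: Callable[[str], Action],
              max_steps: int = 10) -> List[Action]:
    """Observe the screen, ask the policy for an action, apply it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        observation = desktop.screenshot()
        action = choose_action(observation)
        history.append(action)
        if action.kind == "done":
            break
        desktop.apply(action)
    return history


def scripted_policy(observation: str) -> Action:
    """Placeholder for the model: decides the next step from the screenshot."""
    if "focused=False" in observation:
        return Action("click", x=100, y=200)
    if "typed=''" in observation:
        return Action("type", text="hello")
    return Action("done")


desktop = FakeDesktop()
steps = run_agent(desktop, scripted_policy)
print([step.kind for step in steps])
print(desktop.screenshot())
```

In a real deployment the policy would be a model call that receives an actual image, and `max_steps` would act as a safety limit on runaway workflows.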

Coding and tool use have also been upgraded. The model incorporates the coding strengths of GPT-5.3-Codex and performs on par or better in the SWE-Bench Pro benchmark, with lower latency. The "/fast" mode can increase token speed by 1.5 times. The newly added "tool search" function can reduce token consumption by 47% while maintaining accuracy. In multi-step tool-calling tasks, the model reaches higher accuracy with fewer interaction rounds, which lowers development and usage costs. In addition, the model supports context windows of up to 1 million tokens, so it can handle entire code repositories and long documents, addressing the "context discontinuity" pain point of previous models.
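To put the two efficiency figures in concrete terms, here is a back-of-the-envelope sketch. The baseline token count and generation speed are made-up inputs, and the function names are hypothetical. It also assumes that "1.5 times" means 1.5x throughput and that the 47% saving applies uniformly, which may not hold outside the scenarios OpenAI measured.

```python
def tokens_after_tool_search(baseline_tokens: int, reduction_pct: int = 47) -> float:
    """Tokens consumed after applying the reported percentage reduction."""
    return baseline_tokens * (100 - reduction_pct) / 100


def generation_seconds(tokens: int, tokens_per_second: float,
                       speedup: float = 1.0) -> float:
    """Time to generate `tokens` at a given base speed and speed multiplier."""
    return tokens / (tokens_per_second * speedup)


# Hypothetical workload: 100,000 tokens of tool-related context per task.
saved = tokens_after_tool_search(100_000)
print(saved)

# Hypothetical base speed of 50 tokens/s, with and without the 1.5x mode.
normal = generation_seconds(3_000, 50)
fast = generation_seconds(3_000, 50, speedup=1.5)
print(normal, fast)
```

Under these assumptions, a 100,000-token workload drops to 53,000 tokens, and a 3,000-token response that takes 60 seconds at the base speed takes 40 seconds in the faster mode.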