Saturday, August 9, 2025

OpenAI Unveils Newest Chatbot --- Altman says model will be like talking with doctorate-level expert on any topic. How is it different from China’s DeepSeek models?

 

Both OpenAI's GPT-5 and DeepSeek's advanced AI models (such as DeepSeek-R1 and DeepSeek-V3) are at the forefront of large language model (LLM) development, showcasing advanced capabilities in natural language understanding, generation, and reasoning.

Here's a comparison of their strengths and differentiating factors:

OpenAI GPT-5

 

    Release Date: August 2025.

    Context Window: Supports a 400,000-token context window (roughly 600 A4 pages), enabling it to retain information from long conversations and documents.

    Image Input Support: Offers image input capabilities, allowing it to process and analyze visual information alongside text.

    Multimodality: Combines text, image, and audio inputs with interactive outputs, enabling it to create real-time animated explanations.

    Coding: Excels at coding across multiple languages, frameworks, and complex project structures, capable of generating full applications from vague prompts and debugging with minimal input.

    Reasoning: Features a stronger reasoning core and broader memory capabilities compared to previous models, making it more aligned with how humans think and communicate.

    Safety: Incorporates an improved safety architecture that aims to provide helpful information while explaining limitations, rather than outright refusing to answer potentially unsafe questions.

    Integration: Designed to be easily integrated into various applications through the OpenAI API.

    Data Privacy: Runs as a cloud-based service, so users need to handle sensitive information carefully.
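To make the context-window figure above more concrete, here is a minimal Python sketch of the arithmetic. It uses only the two numbers quoted in the list (400,000 tokens, about 600 A4 pages). The helper names and the `reserve_for_reply` parameter are assumptions made for illustration, and the tokens-per-page value is only the average those two numbers imply, not a measured property of any tokenizer.

```python
# Figures quoted in the list above.
GPT5_CONTEXT_TOKENS = 400_000
APPROX_A4_PAGES = 600


def tokens_per_page(context_tokens: int, pages: int) -> float:
    """Average tokens per page implied by the quoted equivalence."""
    return context_tokens / pages


def fits_in_context(doc_pages: int, context_tokens: int,
                    per_page: float, reserve_for_reply: int = 0) -> bool:
    """Rough check: does a document of doc_pages fit in the window,
    leaving reserve_for_reply tokens free for the model's answer?
    (reserve_for_reply is a hypothetical budgeting knob, not an API setting.)"""
    return doc_pages * per_page + reserve_for_reply <= context_tokens


per_page = tokens_per_page(GPT5_CONTEXT_TOKENS, APPROX_A4_PAGES)
print(round(per_page))  # 667 tokens per page, on average

# A 500-page document with 50,000 tokens kept free for the reply.
print(fits_in_context(500, GPT5_CONTEXT_TOKENS, per_page, 50_000))
```

Real documents vary widely in tokens per page, so an estimate like this is only a first pass before counting tokens with an actual tokenizer.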

 

DeepSeek's most advanced AI models (DeepSeek-R1 and DeepSeek-V3)

 

    Architecture: Utilizes a Mixture-of-Experts (MoE) architecture with 671 billion parameters in total, but only activates around 37 billion parameters for each token, improving efficiency and cost-effectiveness.

    Context Window: Supports a context length of up to 128,000 tokens, enabling it to process long conversations and documents.

    Reasoning: Designed for reasoning and complex task-solving, employing advanced reinforcement learning techniques. It is strong on reasoning and mathematical problems but lags slightly on some coding benchmarks.

    Efficiency: DeepSeek models are optimized for efficiency, aiming to reduce computational overhead while maintaining high performance.

    Open Source: DeepSeek-R1's model weights are openly published, so developers and researchers can explore, modify, and deploy the model themselves, within the limits of the hardware available to them.

    Cost-Effectiveness: Training and inference costs are significantly lower than comparable models from OpenAI, according to DeepSeek.

    Local Deployment: Supports local deployment, offering users more control over data privacy and security. Its open-source nature lets users audit and self-host the model for greater privacy control.
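The Mixture-of-Experts point in the list above can be illustrated with a small Python sketch. The first part only restates the quoted figures (671 billion total parameters, about 37 billion active per token) as a fraction. The second part is a toy top-k router, one common way MoE layers pick experts. The number of experts, the router scores, and `top_k = 2` are hypothetical values chosen for illustration and are not DeepSeek's actual configuration or code.

```python
import math

# Figures quoted in the list above (in billions of parameters).
TOTAL_PARAMS_B = 671
ACTIVE_PARAMS_B = 37


def active_fraction(active: float, total: float) -> float:
    """Share of all parameters that is used for a single token."""
    return active / total


def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax over router scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_scores: list[float], top_k: int) -> list[tuple[int, float]]:
    """Toy top-k routing: keep the top_k highest-scoring experts and
    renormalize their weights so they sum to 1. Only these experts run."""
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


print(f"{active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B):.1%}")  # 5.5%

# Hypothetical router scores for 8 experts; 2 experts are selected per token.
scores = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.9]
for expert, weight in route_token(scores, top_k=2):
    print(expert, round(weight, 3))
```

Because only the selected experts are evaluated for each token, compute per token scales with the active parameters rather than the total, which is the efficiency argument the list makes.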

 

In essence, GPT-5 prioritizes high performance and broad capabilities through a powerful and versatile architecture, while DeepSeek emphasizes efficiency, cost-effectiveness, and open-source accessibility.

The choice between the two models depends on individual needs and priorities. For developers seeking to build with an established, feature-rich AI at a potentially higher cost, GPT-5 might be the preferred option.

 

For those prioritizing cost-efficiency, transparency, and customization through an open-source model, DeepSeek presents a compelling alternative, particularly for domain-specific applications and research.

“OpenAI on Thursday launched GPT-5, its most advanced AI model, a culmination of more than two years of development hampered by setbacks and delays.

 

The release follows a year of massive revenue growth for the company, which is still losing money rapidly as it races to develop AI models with humanlike intelligence. OpenAI has largely shaken off concerns over its unstable governance structure and the loss of top executives, some of whom have started their own companies.

 

Interacting with the new technology "really feels like talking to an expert in any topic, like a Ph.D.-level expert," Chief Executive Sam Altman said on a call with reporters Wednesday. He emphasized its ability to allow anyone to create software applications by typing in simple, English-language prompts -- a process commonly referred to as "vibe coding."

 

In a prescripted demo, an OpenAI researcher asked ChatGPT to create a web application that could teach his partner French through interactive games and flashcards. The chatbot did so within minutes, writing more than 300 lines of code.

 

Investors are betting that technology advances like GPT-5 will continue powering the startup's growth and allow it to spearhead an economic transformation that is already well under way.

 

OpenAI is in early talks with venture firm Thrive Capital for a secondary stock sale that would value it at $500 billion, people familiar with the matter said, two-thirds more than its most recent round. The deal, earlier reported by Bloomberg, would make OpenAI the world's most-valuable private company, according to data firm CB Insights. Last week, the startup also raised $8.3 billion from a who's who of startup investors, including Sequoia Capital and Fidelity Management.

 

In June, OpenAI said it hit $10 billion in annualized revenue, a common metric used in the tech world that multiplies the previous month's revenue by 12. It didn't disclose how much it projected to lose over the same period.

 

OpenAI said GPT-5 became available to all at no cost starting Thursday.

 

Paying ChatGPT users, who get higher limits for model usage and access to a more powerful version of GPT-5 depending on what they pay for, also got the new model the same day. ChatGPT Education and Enterprise customers will receive access in a week.

 

Microsoft, OpenAI's closest partner and largest investor, had expected to review GPT-5 around mid-2024, The Wall Street Journal previously reported, but researchers encountered unexpected obstacles, including a shortage of high-quality data, early in the process. The setbacks fueled speculation that AI development was hitting a wall.

 

OpenAI found new breakthroughs by teaching AI models how to "think" for longer, building a set of reasoning models that it integrated into ChatGPT. In February, it also released GPT-4.5, a more incremental upgrade to GPT-4, which came out March 2023. Earlier this week, OpenAI released two "open-weight" models, which are freely available to the public. It stopped short of releasing them as open-source, where the full model architecture and training code are available.

 

In April, Altman rolled back an update to one of the models powering ChatGPT after users posted about the chatbot behaving in extremely sycophantic ways. OpenAI researchers said they trained GPT-5 to reduce this kind of behavior, improve on mental-health scenarios with users and explain its limitations more clearly. The company also said it spent 5,000 hours testing the technology for biased and harmful behavior.

 

The Journal owner News Corp has a content-licensing partnership with OpenAI.

 

The startup has been under intense pressure to build more advanced versions of its technology. After pioneering earlier versions of what are called language models, competitors including Google DeepMind and Anthropic have largely caught up. This summer, Meta Platforms also created an AI superintelligence lab, helmed by former startup CEO Alexandr Wang, that has poached at least 10 researchers from OpenAI.

 

OpenAI has a sizable lead among everyday users of generative AI but faces more competition winning over large business customers, many of whom use a mix of multiple models from different providers. The startup said it has five million paying business customers, including the Big Four accounting firm PricewaterhouseCoopers.” [1]

 

1. OpenAI Unveils Newest Chatbot --- Altman says model will be like talking with doctorate-level expert on any topic. Jin, Berber; Lin, Belle. Wall Street Journal, Eastern edition; New York, N.Y. 08 Aug 2025: B1.