“Two Chinese AI chip companies aim to raise a combined $1.66 billion through initial public offerings, as China steps up efforts to achieve chip independence amid an escalating U.S.-China tech competition.
Beijing-based Moore Threads plans to raise 8 billion yuan, equivalent to $1.12 billion, while Shanghai-based MetaX is targeting 3.9 billion yuan, according to prospectuses filed Monday with the Shanghai Stock Exchange.
Founded in 2020 by former Nvidia executive Zhang Jianzhong, Moore Threads specializes in designing graphics-processing units for AI training. The company plans to use the IPO proceeds to fund new AI chip research and development and to bolster working capital.
MetaX, also founded in 2020 by former AMD employees including Chairman Chen Weiliang, focuses on full-stack [1] GPU chips and related solutions. It intends to use the funds to support high-performance GPU R&D.
Both companies aim to list on Shanghai's STAR Market, the tech-focused board of the Shanghai Stock Exchange.
Moore Threads was added to the U.S. entity list in October 2023, which restricts its access to American technology and equipment.
Despite rapid revenue growth, both companies continue to post steep losses as they expand and invest heavily in research and development.
Moore Threads' revenue more than tripled to 438.85 million yuan in 2024, while its net loss narrowed but remained at 1.49 billion yuan.
MetaX's revenue surged more than tenfold to 743.1 million yuan in 2024, up from 53 million yuan a year earlier. However, it posted a net loss of 232.5 million yuan, attributing it to the low market penetration of domestically produced chips, the limited sales scale of its self-developed GPUs and high R&D costs.” [3]
1. Full-stack GPU chips refer to offerings that go beyond the graphics processing unit (GPU) silicon itself. A full-stack company provides a complete solution: the chip, the associated systems, and the software ecosystem to support them.
NVIDIA is a prime example of a company providing full-stack solutions, particularly in AI.
A full-stack GPU solution includes:
Chips: The GPU itself (such as the Blackwell GPU), along with supporting chips such as CPUs (Grace CPU), data processing units (BlueField), network interface cards (ConnectX), and switches (NVLink Switch, Spectrum Ethernet switch, Quantum InfiniBand switch).
Systems: Hardware platforms that house and integrate the various chips, such as the NVIDIA GB200 NVL72 rack-scale solution.
Software: The software ecosystem that lets developers leverage the hardware's power. This includes:
Optimization for specific workloads like AI inference.
Software libraries and tools for tasks like quantization [2].
Toolkits, libraries, and compilers for developing high-performance applications, such as CUDA (NVIDIA), ROCm (AMD), and oneAPI (Intel).
Debugging and performance analysis tools.
Companies Offering Full-Stack GPU Solutions:
NVIDIA: A leading provider of full-stack GPU solutions, particularly for AI and high-performance computing. NVIDIA integrates hardware and software to optimize performance for demanding applications.
Other contenders: AMD (with ROCm) and Intel (with oneAPI), among others, are also building integrated hardware and software solutions to compete in the GPU market.
A full-stack GPU chip (or solution) goes beyond the hardware to include the software and systems needed to deliver a complete, optimized computing platform. This approach is particularly relevant in areas like AI.
2. Quantization is the process of converting continuous data into discrete, digital representations. This involves reducing the precision of data, often from a higher-precision format (like 32-bit floating-point) to a lower-precision format (like 8-bit integers). This technique is widely used to reduce memory usage, improve computational efficiency, and enable deployment on resource-constrained devices.
In more detail:
Core Concept:
Quantization maps a continuous range of values to a smaller, discrete set of values. Think of it like rounding numbers to the nearest whole number or representing colors using a limited palette.
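As a minimal sketch of that core idea (the function name and the 8-level grid are illustrative assumptions, not from the text above), mapping a continuous range onto a small discrete set amounts to rounding each value to the nearest level:

```python
import numpy as np

# Illustrative sketch: uniformly quantize values in [-1, 1] onto 8 discrete
# levels, i.e. round each value to the nearest point on a coarse grid.
def quantize_uniform(x, num_levels=8, lo=-1.0, hi=1.0):
    step = (hi - lo) / (num_levels - 1)   # spacing between adjacent levels
    q = np.round((x - lo) / step)         # index of the nearest level
    q = np.clip(q, 0, num_levels - 1)     # keep indices inside the grid
    return lo + q * step                  # map indices back to level values

x = np.array([-0.93, -0.2, 0.07, 0.5, 0.88])
print(quantize_uniform(x))  # every output lands on one of the 8 levels
```

However finely the input varies, the output can take at most `num_levels` distinct values, which is exactly the precision reduction the definition describes.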
Applications:
Machine Learning: Reduces model size, speeds up inference (the process of using a trained model to make predictions), and enables deployment on edge devices.
Signal Processing: Converts continuous signals (like audio or video) into a digital format that can be processed by computers.
Music Production: Aligns musical notes to a timing grid to correct for timing imperfections.
Image Processing: Reduces the number of colors used in an image, often for compression or to display images on devices with limited color support.
Physics: In quantum physics, energy, momentum, and other quantities are quantized, meaning they can only take on specific discrete values.
Benefits:
Smaller Model Sizes: Reduced memory footprint for storage and faster loading times.
Faster Inference: Integer arithmetic operations are generally faster than floating-point operations, leading to quicker predictions.
Reduced Energy Consumption: Less energy is required to process data in lower precision.
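To make the memory benefit concrete, here is a back-of-the-envelope calculation for a hypothetical 7-billion-parameter model (the model size is an illustrative assumption, not a figure from the text):

```python
# Illustrative arithmetic: fp32 stores each weight in 4 bytes, int8 in 1 byte,
# so quantizing the weights alone shrinks storage by a factor of 4.
params = 7_000_000_000        # hypothetical parameter count
fp32_gb = params * 4 / 1e9    # fp32 weights: 28.0 GB
int8_gb = params * 1 / 1e9    # int8 weights: 7.0 GB
print(fp32_gb, int8_gb, fp32_gb / int8_gb)
```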
Trade-offs:
Loss of Precision: Reducing precision introduces quantization error, which can affect model accuracy.
Finding the Right Balance: The goal is to find the right balance between reducing model size and maintaining acceptable accuracy.
Techniques:
Post-Training Quantization (PTQ): Quantizes a model after it has been trained, without requiring further training data.
Quantization-Aware Training (QAT): Incorporates quantization into the training process to mitigate accuracy loss.
Various data types: Common data types used in quantization include 8-bit integers (int8), 16-bit floats (fp16), and brain float 16 (bf16).
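Post-training quantization as described above can be sketched minimally as follows. The symmetric per-tensor int8 scheme and all names here are illustrative assumptions, not a description of any particular library's API:

```python
import numpy as np

# Hypothetical PTQ sketch: map fp32 weights to int8 with a single per-tensor
# scale, then dequantize and measure the resulting quantization error.
def quantize_int8(w):
    scale = np.max(np.abs(w)) / 127.0                 # symmetric per-tensor scale
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)          # stand-in for trained fp32 weights
q, scale = quantize_int8(w)                           # int8 tensor: 1/4 the storage
w_hat = dequantize(q, scale)

# Rounding to the int8 grid bounds the per-weight error by half a scale step.
print("max abs quantization error:", np.max(np.abs(w - w_hat)))
```

No retraining is involved, which is what distinguishes PTQ from QAT; QAT instead simulates this rounding inside the training loop so the model learns to compensate for it.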
3. Qin, Sherry. "Chinese AI Chipmakers Plan IPOs." The Wall Street Journal, Eastern edition, New York, N.Y., 2 July 2025, p. B3.