
AI giants clash: Ascend 950 challenges Nvidia H200 and AMD MI300 with low-precision tricks

  • The Huawei Ascend 950DT’s FP8 format is designed to deliver efficient results without sacrificing accuracy.
  • The Nvidia H200 harnesses the power of Hopper’s proven software and GPU ecosystem.
  • The AMD Instinct MI300’s FP64 parity makes it well suited to demanding scientific workloads.

In recent years, demand for AI training and inference has pushed chip makers to innovate aggressively. Memory bandwidth, data formats, interconnects, and overall compute throughput now matter as much as raw floating-point operations.

Companies are focusing on complex scenarios such as AI training and high-performance computing, and AI tools are increasingly relying on high-speed accelerators to process large data sets.


Vendors are addressing these demands with very different platform designs, so here we break down how the Ascend 950, H200, and MI300 Instinct series compare.

Architecture and design approaches

Huawei’s Ascend 950 series uses a purpose-built AI accelerator architecture, optimized for the decode, inference, and model-training stages, rather than a traditional GPU design.

It is designed to combine SIMD and SIMT processing modes with 128-byte memory access granularity to balance performance and flexibility.

Nvidia’s H200 is based on the Hopper GPU architecture and features 16,896 CUDA cores and 528 fourth-generation Tensor cores.

It uses the GH100 single-chip GPU built on TSMC’s 5nm process to ensure compatibility with Nvidia’s software suite and broader ecosystem.


AMD’s MI300 Instinct is built around the Aqua Vanjaram GPU on the CDNA 3.0 architecture, a multi-chip module (MCM) design with 220 compute units and 880 matrix cores.

The chiplet approach conserves transistor budget and targets high-performance computing.

The Ascend 950 offers 1 petaflop of peak performance when using the FP8, MXFP8, or HiF8 data formats, rising to 2 petaflops with MXFP4.

This highlights Huawei’s focus on new low-precision formats designed to increase inference efficiency without sacrificing accuracy.
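To make the idea of a low-precision format concrete, here is a rough, illustrative sketch (not from Huawei) of how an FP8 E4M3-style format quantizes a value: with only 3 mantissa bits there are just 8 representable steps per power of two, and the 4-bit exponent caps the range at 448. The function name and rounding scheme are our own simplification of what accelerator hardware does natively.

```python
import math

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest value representable in FP8 E4M3
    (4 exponent bits, 3 mantissa bits, max normal value 448).
    Illustrative only: real accelerators do this in hardware."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), 448.0)          # clamp to the E4M3 range
    exp = max(math.floor(math.log2(mag)), -6)  # -6 is the smallest normal exponent
    scale = 2.0 ** (exp - 3)          # 3 mantissa bits -> 8 steps per octave
    return sign * round(mag / scale) * scale

print(quantize_e4m3(0.1234))  # -> 0.125 (nearest representable step)
print(quantize_e4m3(1000.0))  # -> 448.0 (clamped to the format's max)
```

The coarse rounding is exactly the accuracy trade-off the article describes: weights and activations tolerate it well, which is why FP8-class formats can double effective throughput.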

The Nvidia H200 delivers 241.3 teraflops at FP16 and 60.3 teraflops at FP32. The AMD MI300, by contrast, delivers 383 teraflops at FP16 and roughly 48 teraflops at both FP32 and FP64.


The MI300’s 1:1 FP64-to-FP32 ratio underlines its suitability for scientific computing, where double precision matters, while Nvidia targets mixed-precision acceleration for AI.
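The contrast in design philosophy can be read straight off the article’s own throughput figures. A quick check, using only the numbers cited above:

```python
# Peak throughput figures as cited in the article (teraflops)
h200 = {"fp16": 241.3, "fp32": 60.3}
mi300 = {"fp16": 383.0, "fp32": 48.0, "fp64": 48.0}

# MI300: FP64 runs at the same rate as FP32 (1:1), ideal for HPC codes
print(mi300["fp64"] / mi300["fp32"])          # -> 1.0

# H200: FP16 runs ~4x faster than FP32, reflecting an AI-first design
print(round(h200["fp16"] / h200["fp32"], 1))  # -> 4.0
```

A 1:1 double-precision ratio means scientific simulations pay no penalty, whereas the H200’s steep FP16 advantage rewards mixed-precision AI workloads.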

Memory architecture strongly affects the training of large language models.

Huawei has equipped the Ascend 950 with 144GB of HiZQ 2.0 HBM, providing up to 4TB/s of memory bandwidth and 2TB/s of interconnect bandwidth.

Nvidia equipped the H200 with 141GB of HBM3e memory and 4.89TB/s of bandwidth, edging out Huawei’s figure.

The AMD MI300 carries 128GB of HBM3, but pairs it with a wider 8,192-bit bus and a higher memory bandwidth of 6.55TB/s.

For training large models or memory-intensive simulations, AMD’s bandwidth advantage can lead to faster data movement, although AMD’s total memory capacity is less than Huawei’s.
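Why bandwidth matters so much can be sketched with a simple back-of-the-envelope model (our own illustration, not a benchmark): in autoregressive decoding, every generated token must stream the full set of weights from HBM, so bandwidth divided by model size gives a rough upper bound on single-stream decode speed. The 70B-parameter model and 1-byte-per-parameter (FP8) assumptions below are hypothetical.

```python
def decode_tokens_per_sec(bandwidth_tb_s: float, params_b: float,
                          bytes_per_param: float = 1.0) -> float:
    """Rough upper bound on autoregressive decode speed: each token
    streams every weight from HBM once, so throughput is roughly
    bandwidth / model size. Ignores batching, KV cache, and overlap."""
    model_bytes = params_b * 1e9 * bytes_per_param
    return bandwidth_tb_s * 1e12 / model_bytes

# Hypothetical 70B-parameter FP8 model, bandwidths as cited in the article
for name, bw in [("Ascend 950", 4.0), ("H200", 4.89), ("MI300", 6.55)]:
    print(f"{name}: ~{decode_tokens_per_sec(bw, 70):.0f} tokens/s upper bound")
```

On this crude model, the MI300’s 6.55TB/s buys roughly 60% more single-stream decode headroom than the Ascend 950’s 4TB/s, which is the kind of data-movement advantage the paragraph above describes.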

The H200 and MI300 both carry a 600W TDP and ship as compute-only PCIe 5.0 x16 cards for data-center server configurations.

While Huawei has not yet revealed official TDP figures, it does offer integrated SuperPOD servers and card formats, demonstrating the deployment flexibility of its AI infrastructure solutions.

The 2TB/s interconnect capability may prove critical for multi-chip scaling in data-center environments, though die size and transistor count remain undisclosed.

Nvidia uses the mature NVLink and InfiniBand ecosystem, and AMD’s multi-chip modules are designed to reduce latency between computing chips.

Huawei’s goal is obviously to make the Ascend 950 usable for large-scale generative AI training and decoding stage inference, a market Nvidia has long dominated.

A fourth-quarter 2026 launch means that Nvidia’s H200, launched in late 2024, and AMD’s MI300, launched in early 2023, already enjoy a head start.

By the time the Ascend 950 reaches customers, both competitors will likely have refreshed their platforms.

However, Huawei’s focus on efficient low-precision formats and tight integration with its networking equipment may appeal to buyers looking for alternatives to American suppliers.

Ultimately, these designs reflect different philosophies among the three brands.

AMD prioritizes memory bandwidth and double precision capabilities for HPC workloads, while Nvidia leverages ecosystem maturity and software support to maintain its leadership in AI enablement.

Huawei, meanwhile, aims to compete through strong FP8-class performance and large native memory capacity.

Via: Huawei, Nvidia
