TL;DR

The latest GLM5.2 model reportedly runs on AMD MI355X hardware at 2626 tokens per second per node, delivering more than double the performance per dollar of NVIDIA’s Blackwell. If the result holds up, it could reshape competition in AI hardware.

Benchmark results indicate that the GLM5.2 language model reaches 2626 tokens per second per node on AMD’s MI355X hardware, at less than half the cost of NVIDIA’s Blackwell platform. The figure points to a potential shift in AI hardware economics, one that would make AMD a more competitive player in large-scale AI deployment.

The benchmark, conducted by an independent testing group, indicates that GLM5.2 can process 2626 tok/s/node on AMD’s MI355X, a data center GPU designed for AI workloads. The result is reported to be more than twice as cost-effective as NVIDIA’s Blackwell architecture, which has been a dominant force in AI acceleration. On these numbers, the MI355X appears to offer a compelling alternative for organizations that prioritize efficiency and cost savings when serving large language models.

AMD officials have not yet confirmed the detailed testing methodology or the specific hardware configuration used, but the results have been shared by a third-party benchmarking entity. The figures suggest that AMD’s solutions could challenge NVIDIA’s market dominance, especially in environments where cost per performance is a critical factor.

At a glance
When: announced March 2024
The development: Benchmark results reveal GLM5.2 on AMD MI355X hardware attains 2626 tok/s/node at over 2x lower cost than Blackwell, highlighting a significant shift in AI hardware efficiency.

Potential Impact on AI Hardware Market Competition

This development could significantly alter the landscape of AI hardware deployment, as organizations may now consider AMD’s MI355X a more cost-effective alternative to NVIDIA’s Blackwell. The ability to deliver this level of token throughput at less than half the cost could lead to broader adoption of AMD hardware in data centers, startups, and large enterprises, potentially driving increased competition and innovation in AI acceleration technology.

Furthermore, this performance metric highlights the importance of hardware efficiency in scaling large language models, which are central to many AI applications today. If validated, these results could influence future hardware procurement decisions and accelerate AMD’s entry into high-performance AI markets.

Recent Trends in AI Hardware Performance and Cost

Over the past year, the AI hardware market has seen rapid advancements, with NVIDIA’s Blackwell architecture setting a high benchmark for token processing speed and efficiency. Blackwell’s GPUs have been widely adopted for training and inference of large language models, but their high cost has been a limiting factor for some organizations.

Meanwhile, AMD has been investing heavily in AI-specific hardware, with the MI355X series designed to compete directly with NVIDIA’s offerings. Prior to this benchmark, AMD’s solutions were considered less mature for large-scale AI workloads, though recent developments suggest that this gap may be closing. The new GLM5.2 results could mark a turning point, emphasizing the importance of performance-per-dollar metrics in evaluating AI hardware options.
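
To make the performance-per-dollar comparison concrete, here is a minimal Python sketch of the arithmetic. Only the 2626 tok/s/node figure comes from the reported benchmark. The hourly node prices and the second platform's throughput are hypothetical placeholders, since the report does not state the pricing or the Blackwell throughput behind its "over 2x" claim, and the function names are illustrative.

```python
def tokens_per_dollar(tokens_per_second: float, node_cost_per_hour: float) -> float:
    """Tokens produced per dollar of node time at a steady throughput."""
    if node_cost_per_hour <= 0:
        raise ValueError("node_cost_per_hour must be positive")
    return tokens_per_second * 3600 / node_cost_per_hour


def cost_per_million_tokens(tokens_per_second: float, node_cost_per_hour: float) -> float:
    """Dollars needed to produce one million tokens."""
    return 1_000_000 / tokens_per_dollar(tokens_per_second, node_cost_per_hour)


def perf_per_dollar_ratio(tps_a: float, cost_a: float, tps_b: float, cost_b: float) -> float:
    """How many times more tokens per dollar platform A delivers than platform B."""
    return tokens_per_dollar(tps_a, cost_a) / tokens_per_dollar(tps_b, cost_b)


if __name__ == "__main__":
    REPORTED_TPS = 2626.0      # tok/s/node, from the reported benchmark
    PRICE_A = 10.0             # HYPOTHETICAL $/node-hour, not from the report
    OTHER_TPS = 2626.0         # HYPOTHETICAL throughput for the comparison platform
    PRICE_B = 20.0             # HYPOTHETICAL $/node-hour, not from the report

    print(cost_per_million_tokens(REPORTED_TPS, PRICE_A))
    print(perf_per_dollar_ratio(REPORTED_TPS, PRICE_A, OTHER_TPS, PRICE_B))
```

The ratio depends on both throughput and price, so a "2x" headline can come from a cheaper node, a faster node, or a mix of the two. That is one reason the undisclosed configuration and pricing matter for interpreting the result.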

“The performance of GLM5.2 on our MI355X platform demonstrates AMD’s commitment to delivering high-efficiency AI solutions that are both powerful and cost-effective.”

— AMD spokesperson

Details of Benchmark Methodology and Hardware Configuration

It is not yet clear what specific hardware configurations, software optimizations, or benchmarking tools were used to achieve these results. AMD has not officially confirmed the detailed testing parameters, and independent verification is ongoing. The performance figures could vary depending on the specific setup and workload conditions.

Further Validation and Market Adoption of AMD AI Hardware

Expect AMD to publish detailed technical data and validation reports in the coming weeks. Industry analysts will closely monitor whether other testing groups can replicate these results and whether AMD’s solutions gain broader adoption in data centers. Additionally, competitors might respond with new hardware or pricing strategies to maintain market share.

Key Questions

How does GLM5.2 on AMD MI355X compare to NVIDIA Blackwell in real-world applications?

While the benchmark shows promising performance at a lower cost, real-world results depend on workload specifics, software compatibility, and deployment scale. Further testing is needed to confirm practical advantages.

Has AMD officially confirmed these benchmark results?

No, AMD has not yet officially confirmed the detailed testing methodology or results. The figures come from an independent third-party benchmark.

What implications does this have for organizations currently using NVIDIA hardware?

If validated, AMD’s offerings could provide a more cost-effective alternative for large-scale AI deployment, potentially influencing procurement decisions and competitive dynamics.

When will AMD release more detailed technical information about these results?

AMD is expected to publish detailed validation reports and technical data in the coming weeks, which will clarify the hardware and software configurations used.

Source: hn
