Nvidia is introducing a new top-of-the-line chip for AI work, the HGX H200. The new GPU upgrades the wildly in-demand H100 with 1.4x more memory bandwidth and 1.8x more memory capacity, improving its ability to handle intensive generative AI work.
The big question is whether companies will be able to get their hands on the new chips or whether they'll be as supply constrained as the H100, and Nvidia doesn't quite have an answer for that. The first H200 chips will be released in the second quarter of 2024, and Nvidia says it's working with "global system manufacturers and cloud service providers" to make them available.
The H200 appears to be largely identical to the H100 outside of its memory. But the changes to its memory make for a meaningful upgrade. The new GPU is the first to use a new, faster memory spec called HBM3e. That brings the GPU's memory bandwidth to 4.8 terabytes per second, up from 3.35 terabytes per second on the H100, and its total memory capacity to 141GB, up from the 80GB of its predecessor.
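The headline multipliers line up with those absolute figures; a quick back-of-the-envelope check (plain Python, using only the published spec numbers):

```python
# Sanity check: do the 1.4x / 1.8x headline figures match the raw specs?
h100_bandwidth_tbps = 3.35  # H100 memory bandwidth, TB/s
h200_bandwidth_tbps = 4.8   # H200 memory bandwidth, TB/s
h100_memory_gb = 80         # H100 memory capacity, GB
h200_memory_gb = 141        # H200 memory capacity, GB

bandwidth_gain = h200_bandwidth_tbps / h100_bandwidth_tbps
capacity_gain = h200_memory_gb / h100_memory_gb

print(f"Bandwidth: {bandwidth_gain:.2f}x")  # ~1.43x, i.e. "1.4x"
print(f"Capacity:  {capacity_gain:.2f}x")   # ~1.76x, rounded up to "1.8x"
```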
"The integration of faster and more extensive HBM memory serves to accelerate performance across computationally demanding tasks including generative AI models and [high-performance computing] applications while optimizing GPU utilization and efficiency," Ian Buck, Nvidia's VP of high-performance computing products, said in a video presentation this morning.
The H200 is also built to be compatible with the same systems that already support H100s. Nvidia says cloud providers won't need to make any changes as they add H200s into the mix. The cloud arms of Amazon, Google, Microsoft, and Oracle will be among the first to offer the new GPUs next year.
Once they launch, the new chips are sure to be expensive. Nvidia doesn't list how much they cost, but CNBC reports that the prior-generation H100s are estimated to sell for anywhere between $25,000 and $40,000 each, with thousands of them needed to operate at the highest levels. The Verge has reached out to Nvidia for more details on pricing and availability of the new chips.
Nvidia's announcement comes as AI companies remain desperately on the hunt for its H100 chips. Nvidia's chips are seen as the best option for efficiently processing the huge quantities of data needed to train and operate generative image tools and large language models. The chips are valuable enough that companies are using them as collateral for loans. Who has H100s is the subject of Silicon Valley gossip, and startups have been working together just to share any access to them at all.
Next year is shaping up to be a more auspicious time for GPU buyers. In August, the Financial Times reported that Nvidia was planning to triple its production of the H100 in 2024. The goal was to produce up to 2 million of them next year, up from around 500,000 in 2023. But with generative AI just as explosive today as it was at the beginning of the year, demand may only be greater, and that's before Nvidia threw an even hotter new chip into the mix.