Nvidia president and CEO Jensen Huang speaks at the COMPUTEX forum in Taiwan. “Everyone is a programmer. Now, you just have to say something to the computer.” (Photo by Walid Berrazeg/SOPA Images/LightRocket via Getty Images)
Nvidia announced a new chip designed to run artificial intelligence models on Tuesday as it seeks to fend off rivals in the AI hardware space, including AMD, Google and Amazon.
Currently, Nvidia dominates the market for AI chips with over 80% market share, according to some estimates. The company’s specialty is graphics processing units, or GPUs, which have become the preferred chips for the large AI models that underpin generative AI software, such as Google’s Bard and OpenAI’s ChatGPT. But Nvidia’s chips are in short supply as tech giants, cloud providers and startups vie for GPU capacity to develop their own AI models.
Nvidia’s new chip, the GH200, has the same GPU as the company’s current highest-end AI chip, the H100. But the GH200 pairs that GPU with 141 gigabytes of cutting-edge memory, as well as a 72-core ARM central processor.
“We’re giving this processor a boost,” Nvidia CEO Jensen Huang said in a talk at a conference on Tuesday. He added, “This processor is designed for the scale-out of the world’s data centers.”
The new chip will be available from Nvidia’s distributors in the second quarter of next year, Huang said, and should be available for sampling by the end of the year. Nvidia representatives declined to give a price.
Oftentimes, the process of working with AI models is split into at least two parts: training and inference.
First, a model is trained using large amounts of data, a process that can take months and sometimes requires thousands of GPUs, such as, in Nvidia’s case, its H100 and A100 chips. Then the model is used in software to make predictions or generate content, using a process called inference. Like training, inference is computationally expensive, and it requires a lot of processing power every time the software runs, such as when it generates a piece of text or an image. But unlike training, inference takes place near-constantly, while training is only required when the model needs updating.
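To make the two phases concrete, here is a minimal sketch in PyTorch using a hypothetical toy model (our illustration, not Nvidia’s code): training repeatedly updates the model’s weights via gradients, which is why it is so compute-hungry, while inference simply runs the finished model with gradients disabled.

```python
# Illustrative sketch only: a stand-in toy model, not a real language model.
import torch
import torch.nn as nn

model = nn.Linear(16, 4)                 # tiny placeholder for a large AI model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Training: loops over large amounts of data and updates the weights.
for step in range(100):
    x = torch.randn(32, 16)              # a batch of (random) training data
    target = torch.randn(32, 4)
    loss = loss_fn(model(x), target)
    optimizer.zero_grad()
    loss.backward()                      # computing gradients is the expensive part
    optimizer.step()

# Inference: no gradients, no weight updates; runs every time the software is used.
model.eval()
with torch.no_grad():
    prediction = model(torch.randn(1, 16))
```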
“You can take pretty much any large language model you want and put it in this and it will inference like crazy,” Huang said. “The inference cost of large language models will drop significantly.”
Nvidia’s new GH200 is designed for inference because it has more memory capacity, allowing larger AI models to fit on a single system, Nvidia VP Ian Buck said on a call with analysts and reporters on Tuesday. Nvidia’s H100 has 80GB of memory, versus 141GB on the new GH200. Nvidia also announced a system that combines two GH200 chips into a single computer for even larger models.
“Having larger memory allows the model to remain resident on a single GPU and not have to require multiple systems or multiple GPUs in order to run,” Buck said.
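Buck’s point can be seen with some back-of-the-envelope arithmetic. The sketch below is our illustration, not Nvidia’s methodology; it assumes the model’s weights are stored at 16-bit precision (2 bytes per parameter) and ignores the extra memory needed for activations and other runtime state.

```python
# Rough estimate (our assumptions): can a model's weights fit in one GPU's memory?
def fits_on_gpu(num_params: float, gpu_mem_gb: float, bytes_per_param: int = 2) -> bool:
    needed_gb = num_params * bytes_per_param / 1e9
    return needed_gb <= gpu_mem_gb

params_70b = 70e9                       # a hypothetical 70-billion-parameter model
print(fits_on_gpu(params_70b, 80))      # False: ~140GB needed, more than an H100's 80GB
print(fits_on_gpu(params_70b, 141))     # True: just fits in the GH200's 141GB
```

Under those assumptions, a 70-billion-parameter model needs roughly 140GB for its weights alone, so it would have to be split across two H100s but could remain resident on a single GH200.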
The announcement comes as Nvidia’s main GPU rival, AMD, recently announced its own AI-oriented chip, the MI300X, which can support 192GB of memory and is being marketed for its capacity for AI inference. Companies including Google and Amazon are also designing their own custom AI chips for inference.
Source: www.cnbc.com