(Credit: Michael Kan)
To improve chatbot performance, Nvidia plans to sell a new kind of processor, an LPU, optimized to run large language models (LLMs).
The “Nvidia Groq 3 LPU” chip was among seven upcoming chips Nvidia touted at the company’s annual GTC event, where it pitched the AI industry on why Nvidia’s chips continue to lead.
The LPU, or Language Processing Unit, comes from Nvidia's deal this past December to license technology from a California AI company called Groq (not to be confused with the AI chatbot Grok from xAI). Founded in 2016, Groq released earlier LPU chips designed specifically for LLMs, promising faster speeds and better energy efficiency. The aim: to create an alternative to Nvidia’s enterprise GPUs, which can be used for a wider range of AI workloads.
Nvidia now wants to pair the newly revealed Groq 3 LPU with the rest of the company’s next-generation AI chips, dubbed the “Vera Rubin” platform, which includes the upcoming Rubin GPU and Vera CPU tech for data centers.
Groq’s LPU chips use SRAM (static RAM), which is faster than the HBM (high-bandwidth memory) typically found on Nvidia’s GPUs. The downside is capacity: Groq’s LPUs can only offer “hundreds of megabytes” of SRAM, whereas HBM can exceed a hundred gigabytes per chip.
That’s why a single Groq 3 LPU only contains 500MB of SRAM, while Nvidia’s upcoming Rubin GPU will feature 288GB of HBM4 memory. To compensate for the lower memory capacity, Nvidia is preparing to sell large batches of LPUs to work alongside the rest of its data center chips, giving AI companies a way to squeeze out even more performance.
Nvidia noted that “the LPX rack with 256 LPU processors features 128GB of on-chip SRAM and 640TB/s of scale-up bandwidth. Deployed with Vera Rubin NVL72 [server unit], Rubin GPUs and LPUs boost decode by jointly computing every layer of the AI model for every output Token.”
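The rack figures follow from the per-chip numbers, and a quick back-of-the-envelope check makes that concrete. The sketch below uses only the figures quoted above and assumes decimal units (1GB = 1,000MB). The per-LPU bandwidth is an illustrative even split of the rack total, not a number Nvidia has published, and the variable names are ours.

```python
# Figures quoted in the article; decimal units assumed (1 GB = 1000 MB).
SRAM_PER_LPU_MB = 500        # SRAM in a single Groq 3 LPU
LPUS_PER_RACK = 256          # LPUs in one LPX rack
RACK_BANDWIDTH_TBPS = 640    # scale-up bandwidth of the rack, in TB/s
RUBIN_HBM4_GB = 288          # HBM4 capacity of one Rubin GPU

# Total on-chip SRAM across the rack, in GB.
rack_sram_gb = SRAM_PER_LPU_MB * LPUS_PER_RACK / 1000

# Assumption: bandwidth is split evenly across LPUs (illustrative only).
bandwidth_per_lpu_tbps = RACK_BANDWIDTH_TBPS / LPUS_PER_RACK

# Number of LPUs whose combined SRAM equals one Rubin GPU's HBM4 capacity.
lpus_to_match_rubin = RUBIN_HBM4_GB * 1000 // SRAM_PER_LPU_MB

print(f"SRAM per rack: {rack_sram_gb:.0f} GB")
print(f"Bandwidth per LPU (even split): {bandwidth_per_lpu_tbps} TB/s")
print(f"LPUs to match one Rubin GPU's memory: {lpus_to_match_rubin}")
```

By this math, 256 LPUs at 500MB each add up to the 128GB Nvidia cites, and it would take 576 LPUs to match the memory capacity of a single Rubin GPU, which helps explain why the LPUs are sold in large batches.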
A data center could thus harness both the LPUs and Nvidia's GPUs, dividing AI workloads between them to increase efficiency. Nvidia's CEO, Jensen Huang, said the combined approach excels at helping AI companies boost performance with longer prompts.
Combined, the LPUs and Rubin GPUs also promise to deliver up to a 35x increase in throughput when running a large language model with 1 trillion parameters, according to Nvidia's benchmarks.
"We're in production with the Groq chip," Huang said, adding that it'll likely ship in Q3. Nvidia has contracted Samsung to manufacture the LPU. One analyst already expects Nvidia to ship out 4 to 5 million LPUs through 2026 and 2027.
The new LPU and Vera Rubin systems will likely cost tens of thousands of dollars per chip, putting them far out of reach of consumers. Instead, expect the biggest AI companies, including OpenAI, Anthropic, and Meta, to adopt these technologies, which could power your chatbot queries or image-generation requests in the near future.
At GTC, Nvidia also talked up Vera Rubin, which the company has detailed before, including at January’s CES, where it said the Rubin chips were in “full production.” Nvidia plans to ship the Vera Rubin-related chips, including the new LPU, in the second half of this year.


