Intel introduces Gaudi 3 processors as part of its AI strategy

At Intel Vision 2024, Intel had a lot to say about AI and what it is working on in that area. The company announced a new AI accelerator called Gaudi 3, plans to collaborate on an open platform for enterprise AI, and next-generation processors.

Gaudi 3 uses Ethernet to connect tens of thousands of accelerators, which the company believes will enable “a significant leap in artificial intelligence training and inference for global enterprises looking to deploy GenAI at scale.”

Each accelerator can perform 64,000 operations in parallel, supporting the computational complexity required by deep learning algorithms. It offers 128 GB of memory capacity, 3.7 TB/s of memory bandwidth, and 96 MB of on-board SRAM. According to Intel, these memory specifications allow large language models (LLMs) and multimodal models to be served efficiently.
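As a rough back-of-the-envelope sketch of why those figures matter for serving, weight memory scales with parameter count times bytes per parameter, and memory bandwidth bounds how quickly those weights can be streamed for each generated token. The numbers below are illustrative only, not Intel benchmarks.

```python
# Back-of-the-envelope sketch (illustrative numbers, not Intel benchmarks):
# weight memory = parameter count x bytes per parameter, and streaming all
# weights once per generated token gives a rough upper bound on decode speed.
params_70b = 70e9  # a hypothetical 70B-parameter model
bytes_per_param = {"fp32": 4, "bf16": 2, "fp8": 1}

for fmt, nbytes in bytes_per_param.items():
    weight_gb = params_70b * nbytes / 1e9
    print(f"70B model in {fmt}: ~{weight_gb:.0f} GB of weights")

# At ~3.7 TB/s, reading 140 GB of BF16 weights once per token takes about
# 140 / 3700 ≈ 0.038 s, i.e. an upper bound near ~26 tokens/s for a single,
# memory-bound decode stream (ignoring KV cache, batching and compute limits).
print(f"~{1 / (140 / 3700):.0f} tokens/s upper bound at 3.7 TB/s")
```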

The Gaudi software stack integrates with the PyTorch framework and provides optimized Hugging Face community models, which the company says makes it easier to port models between different types of hardware.
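On earlier Gaudi generations, that integration has looked roughly like ordinary PyTorch code targeting an "hpu" device through the Habana bridge. The sketch below assumes the SynapseAI PyTorch plugin (habana_frameworks) and the Hugging Face transformers library are installed; the package, device and model names follow Habana's published documentation for prior Gaudi hardware and are assumptions rather than confirmed Gaudi 3 specifics.

```python
# Hypothetical sketch: running a Hugging Face causal LM on a Gaudi device via
# the Habana PyTorch bridge. Package, device and model names follow Habana's
# public docs for earlier Gaudi generations and are assumptions here.
import torch
import habana_frameworks.torch.core as htcore  # registers the "hpu" device
from transformers import AutoModelForCausalLM, AutoTokenizer

device = torch.device("hpu")

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative model choice
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)

inputs = tokenizer("Gaudi 3 connects accelerators over", return_tensors="pt").to(device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=32)
    htcore.mark_step()  # flush the lazily built graph to the accelerator

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```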

Gaudi 3 is also being offered as a PCI Express (PCIe) add-in card, a form factor suited to workloads such as fine-tuning, inference, and retrieval-augmented generation (RAG).
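For readers unfamiliar with the last of those, retrieval-augmented generation pairs an LLM with a document store: the passages most relevant to a query are retrieved and prepended to the prompt. The minimal, accelerator-agnostic sketch below assumes the sentence-transformers package; the embedding model name and documents are illustrative.

```python
# Minimal, accelerator-agnostic RAG sketch: embed a small document store,
# retrieve the closest passages for a query, and build an augmented prompt.
# Assumes the sentence-transformers package; the model name is illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "Gaudi 3 connects accelerators over standard Ethernet.",
    "Each Gaudi 3 accelerator has 128 GB of memory.",
    "Xeon 6 processors come in E-core and P-core variants.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(documents, normalize_embeddings=True)

query = "How do Gaudi 3 accelerators communicate with each other?"
query_vec = embedder.encode([query], normalize_embeddings=True)[0]

# With normalized vectors, cosine similarity is just a dot product.
scores = doc_vecs @ query_vec
top_passages = [documents[i] for i in np.argsort(scores)[::-1][:2]]

prompt = (
    "Answer the question using only the context below.\n\n"
    "Context:\n" + "\n".join(top_passages) +
    f"\n\nQuestion: {query}\nAnswer:"
)
# `prompt` would then be sent to whatever LLM serving stack is in use.
print(prompt)
```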

Compared with its competitor, Nvidia's H100, Intel expects Gaudi 3 to be 50% faster at training Llama 2 models with 7B and 13B parameters and GPT-3 with 175B parameters. It is also projected to deliver roughly 50% better inference throughput overall and 40% greater power efficiency than Nvidia's chip.

Intel expects Gaudi 3 to be available to vendors including Dell Technologies, HPE, Lenovo and Supermicro in the second quarter of this year.

“In the ever-evolving AI market, there remains a significant gap in current offerings,” said Justin Hotard, executive vice president and general manager of the Data Center and AI Group at Intel. “Feedback from our customers and the wider market emphasizes the desire for more choice. Businesses consider factors such as availability, scalability, performance, cost and energy efficiency. Intel Gaudi 3 stands out as a GenAI alternative by presenting a compelling combination of price performance, system scalability and time-to-value advantages.”

Along with the Gaudi 3 announcement, the company also announced that it is working with a number of companies to create an open platform for enterprise AI.

To support this effort, Intel will release reference implementations for GenAI pipelines for Intel Xeon and Gaudi-based systems, release a technical conceptual framework, and add more infrastructure capacity to the Intel Tiber Developer Cloud.

Other companies collaborating on this project include Anyscale, Articul8, DataStax, Domino, Hugging Face, KX Systems, MariaDB, MinIO, Qdrant, Red Hat, Redis, SAP, VMware, Yellowbrick and Zilliz.

And finally, the company announced the next generation of its Intel Xeon processors. The new Intel Xeon 6 processors come in Efficient-core (E-core) and Performance-core (P-core) variants. E-cores offer a 4x performance improvement and 2.7x better rack density than 2nd Gen Intel Xeon processors, while P-cores add support for the MXFP4 data format, reducing token latency by up to 6.5x compared to 4th Gen Intel Xeon processors.
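MXFP4 is the 4-bit member of the OCP Microscaling (MX) family: elements are stored as FP4 (E2M1) values and small blocks share a single power-of-two scale. The sketch below illustrates only those numerics, assuming the 32-element block size from the OCP spec; it says nothing about how Intel implements the format in hardware.

```python
# Illustrative numerics of a block-scaled 4-bit format in the spirit of MXFP4
# (OCP Microscaling): 32-element blocks share one power-of-two scale and each
# element is rounded to an FP4 (E2M1) value. Not Intel's implementation.
import numpy as np

# Non-negative magnitudes representable in FP4 E2M1: 0, 0.5, 1, 1.5, 2, 3, 4, 6.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def fake_quantize_mxfp4_block(block: np.ndarray) -> np.ndarray:
    """Quantize one block (<= 32 floats) to MXFP4-style values and dequantize."""
    max_abs = np.max(np.abs(block))
    if max_abs == 0:
        return np.zeros_like(block)
    # Shared power-of-two scale chosen from the block maximum; E2M1's largest
    # exponent is 2, so values above 6 after scaling saturate to the FP4 max.
    scale = 2.0 ** (np.floor(np.log2(max_abs)) - 2)
    scaled = block / scale
    nearest = np.argmin(np.abs(np.abs(scaled)[:, None] - FP4_GRID[None, :]), axis=1)
    return np.sign(scaled) * FP4_GRID[nearest] * scale

block = np.random.randn(32).astype(np.float32)
print("max abs error:", np.max(np.abs(block - fake_quantize_mxfp4_block(block))))
```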

According to Intel, Xeon 6 processors with E-cores will be launched this quarter, and processors with P-cores will be launched after that.

The company also announced that the next generation of Intel Core Ultra processors will launch later this year, delivering more than 100 platform tera operations per second (TOPS) and more than 45 TOPS from the neural processing unit (NPU).

“Innovation is advancing at an unprecedented rate, all enabled by silicon – and every company is quickly becoming an AI company,” said Pat Gelsinger, CEO of Intel. “Intel is bringing AI everywhere to the enterprise, from the PC to the data center to the edge. Our latest Gaudi, Xeon and Core Ultra platforms deliver a cohesive set of flexible solutions tailored to meet the changing needs of our customers and partners and take advantage of the enormous opportunities ahead.”
