Intel® Gaudi® AI Accelerators Blog

With the Intel Gaudi SynapseAI 1.13.0 release, users can fine-tune the Llama 2 70B model using only 8 Gaudi2 accelerators.
The release brings numerous enhancements and updates for an improved user experience.
One of the main challenges in training Large Language Models (LLMs) is that they are often too large to fit on a single node, or, even if they fit, training may be too slow. To address this, training can be parallelized across multiple Gaudi accelerators (HPUs).
If you want to train a large model using Megatron-DeepSpeed, but the model you want is not included in the implementation, you can port it to the Megatron-DeepSpeed package. Assuming your model is transformer-based, you can add your implementation easily, basing it on existing code.
We have optimized additional Large Language Models on Hugging Face using the Optimum Habana library.
In this release, we’ve upgraded versions of several libraries, including DeepSpeed 0.9.4, PyTorch Lightning 2.0.4 and TensorFlow 2.12.1.
We are excited to see Meta release Llama 2 to help further democratize access to LLMs. Making such models more widely available will facilitate efforts across the AI community to benefit the world at large.
MLCommons published results of its industry AI performance benchmark, MLPerf Training 3.0, in which both the Habana® Gaudi®2 deep learning accelerator and the 4th Gen Intel® Xeon® Scalable processor delivered impressive training results.
Habana Labs, an Intel company, and Genesis Cloud are collaborating to deliver a new class of cloud instances with Habana® Gaudi®2 accelerators to enable high-performance, high-efficiency deep learning training and inference workloads in the cloud.
In the 1.10 release, we’ve upgraded versions of several libraries, including PyTorch 2.0.1, PyTorch Lightning 2.0.0 and TensorFlow 2.12.0. We have added support for EKS 1.25 and OpenShift 4.12.
We’re excited to participate in this year’s ISC High Performance Compute 2023 event in Hamburg, Germany. This year our team will demonstrate the capabilities of our Habana Gaudi2® processors, which deliver high-performance, high-efficiency deep learning training and inference.
Equus and Habana have teamed up to simplify the process of testing, implementing and deploying AI infrastructure based on Habana Gaudi2 processors.
In some training workloads, the graph is re-compiled repeatedly. Each re-compilation adds latency, and across many iterations this slows down the overall training process. This blog focuses on detecting these graph re-compilations.
Announcing a new end-to-end use case showing training of a semantic segmentation model for autonomous driving.
In the 1.9 release, we’ve upgraded versions of several libraries, including PyTorch Lightning 1.9.4, DeepSpeed 0.7.7, fairseq 0.12.3, and Horovod v0.27.0.
In this article, you'll learn how to easily deploy multi-billion parameter language models on Habana Gaudi2 and get a view into the Hugging Face performance evaluation of Gaudi2 and A100 on BLOOMZ.
AWS and Habana collaborated to enable EFA Peer Direct support on the Gaudi-based AWS DL1 instances, offering users a significant improvement in multi-instance model training performance.
AI is becoming increasingly important for retail use cases. It can provide retailers with advanced capabilities to personalize customer experiences, optimize operations, and increase sales. Habana has published a new Retail use case showing an ...
In this article, you will learn how to use Habana® Gaudi®2 to accelerate model training and inference, and train bigger models with 🤗 Optimum Habana.
With support for DeepSpeed Inference in Habana’s SynapseAI 1.8.0 release, users can run inference on large language models, including BLOOM 176B.
Our blog today features a Riken white paper, initially prepared and published by the Intel Japan team in collaboration with Kei Taneishi, research scientist with Riken’s Institute of Physical and Chemical Research. […]
We have upgraded versions of several libraries with SynapseAI 1.8.0, including PyTorch 1.13.1, PyTorch Lightning 1.8.6 and TensorFlow 2.11.0 & 2.8.4.
In this paper, we’ll show how Transfer Learning is an efficient way to train an existing model on a new and unique dataset with equivalent accuracy and significantly less training time.
In this post, we show you how to run Habana’s DeepSpeed-enabled BERT 1.5B model from our Model-References repository.