This tutorial provides example training scripts that demonstrate different DeepSpeed optimization technologies on HPU, focusing on memory optimization technologies, including the Zero Redundancy Optimizer (ZeRO) and activation checkpointing. Example Overview: The PyTorch minGPT ...
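As a rough illustration of what combining these two technologies looks like, here is a minimal sketch using DeepSpeed's Python API. The batch size, ZeRO stage, optimizer settings, and checkpointing flags are illustrative assumptions, not the tutorial's actual values, and the linear layer stands in for the real minGPT model:

```python
# Sketch: a DeepSpeed config combining ZeRO and activation checkpointing.
import deepspeed
import torch

ds_config = {
    "train_batch_size": 16,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    # ZeRO stage 2 partitions optimizer states and gradients across workers.
    "zero_optimization": {"stage": 2},
    # Trade compute for memory: re-compute activations in the backward pass.
    "activation_checkpointing": {
        "partition_activations": True,
        "contiguous_memory_optimization": True,
    },
}

model = torch.nn.Linear(1024, 1024)  # placeholder for the real minGPT model
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```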
The SDSC Voyager supercomputer is an innovative AI system designed specifically for science and engineering research at scale. Funded by the National Science Foundation, Voyager is a collaboration between the San Diego Supercomputer Center at UC San Diego, Supermicro, ...
In training workloads, certain scenarios can trigger graph re-compilations, which add latency and slow down the overall training process across multiple iterations. This blog focuses on ...
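One common re-compilation trigger on graph-compiled devices is dynamic input shapes: every new tensor shape can force a fresh compilation. A frequently used mitigation, sketched below under that assumption, is to pad variable-length batches to a small fixed set of bucket sizes so that already-compiled graphs are reused; the bucket sizes here are illustrative:

```python
# Sketch: shape bucketing to avoid per-shape graph re-compilation.
import torch
import torch.nn.functional as F

BUCKETS = [128, 256, 512]  # fixed sequence lengths the graph is compiled for

def pad_to_bucket(input_ids: torch.Tensor, pad_id: int = 0) -> torch.Tensor:
    """Pad a [batch, seq_len] tensor up to the nearest bucket length."""
    seq_len = input_ids.size(-1)
    target = next((b for b in BUCKETS if b >= seq_len), seq_len)
    return F.pad(input_ids, (0, target - seq_len), value=pad_id)

batch = torch.randint(0, 1000, (4, 200))
print(pad_to_bucket(batch).shape)  # torch.Size([4, 256]): reuses the 256 graph
```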
In this tutorial, we will demonstrate fine-tuning a GPT2 model on Habana Gaudi AI processors using the Hugging Face optimum-habana library with DeepSpeed. What is fine-tuning? Training models from scratch can be expensive, especially ...
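A minimal sketch of that flow, using optimum-habana's GaudiTrainer and GaudiTrainingArguments; the Gaudi config name, the DeepSpeed config path, and the toy dataset are assumptions standing in for the tutorial's real inputs:

```python
# Sketch: fine-tuning GPT2 on Gaudi with optimum-habana and DeepSpeed.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast
from optimum.habana import GaudiTrainer, GaudiTrainingArguments

model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

class ToyDataset(torch.utils.data.Dataset):
    """Tiny stand-in corpus; replace with a real tokenized dataset."""
    def __init__(self):
        self.ids = tokenizer("hello world", return_tensors="pt")["input_ids"][0]
    def __len__(self):
        return 8
    def __getitem__(self, idx):
        return {"input_ids": self.ids, "labels": self.ids}

args = GaudiTrainingArguments(
    output_dir="./gpt2-finetuned",
    use_habana=True,                  # run on Gaudi (HPU) devices
    use_lazy_mode=True,               # lazy-mode graph execution on HPU
    gaudi_config_name="Habana/gpt2",  # assumed Hub config name
    deepspeed="ds_config.json",       # path to a DeepSpeed config (assumed)
    num_train_epochs=1,
)

trainer = GaudiTrainer(model=model, args=args, train_dataset=ToyDataset())
trainer.train()
```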
One of the key challenges in Large Language Model (LLM) training is reducing the memory required for training without sacrificing compute/communication efficiency or model accuracy. DeepSpeed [2] is a popular deep learning software library ...
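The ZeRO paper's back-of-the-envelope arithmetic makes the memory challenge concrete: with mixed-precision Adam, each parameter costs roughly 16 bytes (2 for fp16 weights, 2 for fp16 gradients, 12 for fp32 optimizer state), and each successive ZeRO stage partitions one more of these across the N data-parallel devices. A small sketch of that calculation, using the paper's 7.5B-parameter, 64-device example:

```python
# Sketch: per-device memory under different ZeRO stages (mixed-precision Adam).
def zero_memory_gb(num_params: float, n_devices: int, stage: int) -> float:
    p, g, o = 2 * num_params, 2 * num_params, 12 * num_params  # bytes
    if stage >= 1:
        o /= n_devices  # stage 1: partition optimizer states
    if stage >= 2:
        g /= n_devices  # stage 2: also partition gradients
    if stage >= 3:
        p /= n_devices  # stage 3: also partition parameters
    return (p + g + o) / 1e9

# A 7.5B-parameter model on 64 devices:
print(zero_memory_gb(7.5e9, 64, stage=0))  # ~120 GB per device (baseline)
print(zero_memory_gb(7.5e9, 64, stage=2))  # ~16.6 GB per device
```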