We have added support for RHEL8 on Gaudi2. Support for TensorFlow 2.8.4 is deprecated and is no longer available as of this release.
Habana now provides a GPU Migration Toolkit, which simplifies migrating PyTorch models that contain Python API calls with dependencies on GPU libraries (for example, torch.cuda calls). See GPU Migration Toolkit. We have enabled native PyTorch autocast support for Gaudi and have demonstrated it on several reference models. For users interested in tracking memory usage, HPU memory utilization metrics can now be monitored in TensorBoard during training.
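As a rough illustration of the autocast support mentioned above, here is a minimal sketch of a forward pass under torch.autocast. It assumes that the Habana PyTorch bridge is importable as habana_frameworks.torch.core and that importing it makes the "hpu" device type available; the helper names pick_device and forward_with_autocast and the toy Linear model are hypothetical and not taken from the reference models. When the bridge is not installed, the sketch falls back to CPU so it still runs.

```python
import importlib.util

import torch


def pick_device() -> str:
    """Return "hpu" if the Habana bridge package is installed, else "cpu"."""
    if importlib.util.find_spec("habana_frameworks") is not None:
        # Assumption: importing the bridge registers the "hpu" device with PyTorch.
        import habana_frameworks.torch.core  # noqa: F401
        return "hpu"
    return "cpu"


def forward_with_autocast(device: str) -> torch.Tensor:
    """Run one forward pass of a toy model under bfloat16 autocast."""
    model = torch.nn.Linear(8, 4).to(device)  # hypothetical stand-in model
    x = torch.randn(2, 8, device=device)
    # Autocast-eligible ops such as linear layers run in bfloat16 inside this block.
    with torch.autocast(device_type=device, dtype=torch.bfloat16):
        return model(x)


out = forward_with_autocast(pick_device())
print(out.dtype, tuple(out.shape))
```

The model weights stay in float32; only the operations inside the autocast block are run in lower precision, which is why no manual casts appear in the sketch.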
Habana’s fork of DeepSpeed now includes support for ZeRO-3 as well as ZeRO-Offload with ZeRO-1 or ZeRO-2 for training. In addition, DeepSpeed activation checkpointing is validated with activation partitioning and contiguous memory optimization flags. See Habana DeepSpeed User Guide.
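The combinations listed above can be sketched as a DeepSpeed configuration. The snippet below builds one as a plain Python dictionary and prints it as JSON. The key names (zero_optimization, offload_optimizer, activation_checkpointing, partition_activations, contiguous_memory_optimization, number_checkpoints) follow the upstream DeepSpeed configuration schema; the batch size and checkpoint count are placeholder values, and the helper build_ds_config is hypothetical. Consult the Habana DeepSpeed User Guide for the settings actually validated on Gaudi.

```python
import json


def build_ds_config(zero_stage: int, offload_optimizer: bool = False) -> dict:
    """Build an illustrative DeepSpeed config dict (hypothetical helper)."""
    if zero_stage not in (1, 2, 3):
        raise ValueError("zero_stage must be 1, 2 or 3")
    # This check mirrors the release note (offload with ZeRO-1 or ZeRO-2);
    # it is not a statement about what DeepSpeed itself allows.
    if offload_optimizer and zero_stage == 3:
        raise ValueError("this sketch pairs ZeRO-Offload only with ZeRO-1 or ZeRO-2")

    zero = {"stage": zero_stage}
    if offload_optimizer:
        # ZeRO-Offload: keep optimizer state in host memory.
        zero["offload_optimizer"] = {"device": "cpu"}

    return {
        "train_micro_batch_size_per_gpu": 1,  # placeholder value
        "zero_optimization": zero,
        "activation_checkpointing": {
            "partition_activations": True,
            "contiguous_memory_optimization": True,
            "number_checkpoints": 4,  # placeholder; set to match the model
        },
    }


config = build_ds_config(zero_stage=2, offload_optimizer=True)
print(json.dumps(config, indent=2))
```

Writing the printed JSON to a file gives a config that can be passed to a DeepSpeed launch in the usual way; calling build_ds_config(zero_stage=3) yields the ZeRO-3 variant without offload.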
Several reference models have been updated with instructions for training and inference on Habana Gaudi2 and first-gen Gaudi. For training, this includes Stable Diffusion scaling up to 64 cards and HuBERT; for inference, BLOOM-176B with beam search, Stable Diffusion v2.1, UNet2D, UNet3D and ResNext101. We have enhanced the performance of many models with this release. Check out Habana’s model performance page.
You can find more information on SynapseAI 1.9.0 on Habana’s release notes page.