For PyTorch profiling, we have added Gaudi-specific debug information in TensorBoard. See Profiling with PyTorch for more information.
Habana’s fork of DeepSpeed now includes ZeRO-Infinity support with ZeRO-3. In addition, DeepSpeed activation checkpointing is validated with the cpu_checkpointing and synchronize_checkpoint_boundary flags. See Habana DeepSpeed User Guide.
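As a rough illustration of how these options fit together, the sketch below builds a DeepSpeed configuration dictionary that enables ZeRO-3 with ZeRO-Infinity-style offload and turns on the two activation checkpointing flags. The key names follow the upstream DeepSpeed configuration schema. The helper name build_ds_config, the batch size, and the offload choices are illustrative assumptions, not values taken from Habana's documentation, and the Habana DeepSpeed User Guide remains the reference for which combinations are validated on Gaudi.

```python
import json


def build_ds_config(nvme_path=None):
    """Hypothetical helper: returns a DeepSpeed config dict (illustrative values)."""
    # ZeRO-Infinity offloads optimizer state and parameters to CPU or NVMe.
    offload = {"device": "nvme" if nvme_path else "cpu", "pin_memory": True}
    if nvme_path:
        offload["nvme_path"] = nvme_path

    return {
        # Assumed value, chosen only to make the config complete.
        "train_micro_batch_size_per_gpu": 1,
        "zero_optimization": {
            "stage": 3,  # ZeRO-Infinity builds on ZeRO stage 3
            "offload_optimizer": dict(offload),
            "offload_param": dict(offload),
        },
        "activation_checkpointing": {
            # Upstream DeepSpeed describes cpu_checkpointing as offloading
            # partitioned activations, so partitioning is enabled here too.
            "partition_activations": True,
            "cpu_checkpointing": True,
            "synchronize_checkpoint_boundary": True,
        },
    }


ds_config = build_ds_config()
print(json.dumps(ds_config, indent=2))
```

The printed JSON can be saved to a file and passed to a DeepSpeed launch in the usual way. Passing a directory to build_ds_config switches the offload target from CPU memory to NVMe.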
Most of Habana’s reference models have now been migrated to autocast. Support for Habana Mixed Precision (HMP) will be dropped in a subsequent release.
Habana now provides a detect_recompilation tool to automatically detect dynamic inputs and dynamic ops in a model. See Handling Dynamic Shapes.
Several reference models have been updated with instructions for training and inference on Habana Gaudi2 and first-generation Gaudi accelerators. This release also improves the performance of many models. Check out Habana’s model performance page.
You can find more information on SynapseAI 1.10.0 on Habana’s release notes page.