As models grow larger, libraries and techniques that reduce memory consumption are needed so that models fit into device memory. In this webinar, you’ll learn the basic steps needed to enable DeepSpeed on Gaudi and see how the ZeRO-1 and ZeRO-2 memory optimizers and Activation Checkpointing can reduce memory usage on a large model. We’ll also show you how to use Gaudi’s APIs to measure the peak memory of any model and provide guidance on when to use these techniques. A live Q&A will follow the presentation.
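As a rough preview of the kind of configuration involved, the sketch below builds a DeepSpeed-style config dictionary that selects a ZeRO stage. It is a minimal illustration, not the webinar's own material: the helper name `make_deepspeed_config` and the default values are hypothetical, and the exact settings recommended for Gaudi should be taken from the DeepSpeed User Guide linked below. The keys shown (`zero_optimization.stage`, `bf16.enabled`, `train_micro_batch_size_per_gpu`, `gradient_accumulation_steps`) are standard DeepSpeed config fields.

```python
import json


def make_deepspeed_config(zero_stage=1, micro_batch_size=1, grad_accum_steps=1):
    """Build a DeepSpeed-style config dict (hypothetical helper for illustration).

    zero_stage 0 disables ZeRO, 1 partitions optimizer states across devices,
    and 2 additionally partitions gradients.
    """
    if zero_stage not in (0, 1, 2):
        raise ValueError("this sketch only covers ZeRO stages 0, 1 and 2")
    return {
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "gradient_accumulation_steps": grad_accum_steps,
        # Assumption: bf16 mixed precision is used for training.
        "bf16": {"enabled": True},
        "zero_optimization": {"stage": zero_stage},
    }


if __name__ == "__main__":
    # Print a ZeRO-2 config that could be saved as a JSON file
    # and passed to a DeepSpeed-enabled training script.
    print(json.dumps(make_deepspeed_config(zero_stage=2), indent=2))
```

Activation Checkpointing is not covered by this dictionary alone; it also requires changes in the model code, so it is left out of the sketch.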
Related links and resources:
- Documentation and DeepSpeed User Guide
- System Access:
  - AWS DL1 Instance: https://aws.amazon.com/ec2/instance-types/dl1/
  - Intel Developer Cloud: https://developer.habana.ai/intel-developer-cloud/
- Model Repository
- For additional support, visit Habana’s User Forum to post questions and chat with other users