One of the main challenges in training Large Language Models (LLMs) is that they are often too large to fit on a single node or even if they fit, the training may be too slow. To address this issue, their training can be parallelized across multiple Gaudi accelerators (HPUs).
If you want to train a large model using Megatron-DeepSpeed, but the model you want is not included in the implementation, you can port it to the Megatron-DeepSpeed package. Assuming your model is transformer-based, you can add your implementation easily, basing it on existing code.
We’re excited to participate in this year’s ISC High Performance Compute 2023 event in Hamburg Germany. This year our team will demonstrate the capabilities of our Habana Gaudi2® processors, which deliver high-performance, high-efficiency deep learning training and inference.