Get Optimized

Model Optimization and Debugging

Profiling and optimization take your model from merely functional to performant. To get started, follow the Performance Optimization Guide Checklist, refer to the optimization guide documentation, and review the summary below.

Model optimization falls into three main categories:

  1.  Initial Model Porting – Ensures the model is functional on Intel Gaudi processors by running GPU migration. Follow the Getting Started guide for Training or Inference.
  2.  Model Optimizations – General performance enhancements that apply to most models. These include managing Dynamic Shapes and using HPU Graphs for Training or Inference.
  3.  Profiling – Identifies bottlenecks on the host CPU or on Intel Gaudi. Follow the steps below in order, starting with TensorBoard and the Intel Gaudi plug-in to identify specific items:
      a.  First Step: PyTorch Profiling with TensorBoard – Obtains Gaudi-specific performance recommendations using TensorBoard. Reference: Profiling with PyTorch.
      b.  Review the PT_HPU_METRICS_FILE – Looks for excessive re-compilations during runtime. Reference: Setting HPU Metrics Review.
      c.  Intel Gaudi Profiling Trace Viewer – Uses the Intel Gaudi-specific Perfetto trace viewer for in-depth analysis of CPU and Intel Gaudi activity. Reference: Getting started with Perfetto Profiler.
      d.  Model Logging – Sets ENABLE_CONSOLE to enable logging for debug and analysis. Reference: Runtime Environment Variables.
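
The two environment variables named above, PT_HPU_METRICS_FILE and ENABLE_CONSOLE, are typically set before the training or inference process starts. The following is a minimal sketch of one way to do that from Python, using only the standard library. The helper names build_debug_env and run_with_debug_env are hypothetical, and two points are assumptions rather than documented here: that PT_HPU_METRICS_FILE takes a file path, and that ENABLE_CONSOLE is switched on with the value "true". Check the Runtime Environment Variables reference for the accepted values.

```python
import os
import subprocess
import sys


def build_debug_env(metrics_path, console_logging=True, base_env=None):
    """Return a copy of the environment with the debug variables added.

    Hypothetical helper: it only prepares variables, it does not touch
    the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: PT_HPU_METRICS_FILE expects the path of the metrics file.
    env["PT_HPU_METRICS_FILE"] = str(metrics_path)
    if console_logging:
        # Assumption: "true" is the value that enables console logging.
        env["ENABLE_CONSOLE"] = "true"
    return env


def run_with_debug_env(command, metrics_path, console_logging=True):
    """Run `command` (a list of arguments) in a child process that sees
    the debug variables, and return its exit code."""
    env = build_debug_env(metrics_path, console_logging)
    completed = subprocess.run(command, env=env, check=False)
    return completed.returncode


if __name__ == "__main__":
    # Stand-in for a real training script: the child process just prints
    # the metrics variable it received.
    child = [
        sys.executable,
        "-c",
        "import os; print(os.environ['PT_HPU_METRICS_FILE'])",
    ]
    run_with_debug_env(child, "hpu_metrics.json")
```

In practice the child command would be your own training or inference script. Passing the variables through a copied environment, rather than changing os.environ in place, keeps the debug settings scoped to that one run.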


Maximizing Model Performance
