Model Optimization and Debugging
Profiling and optimization take your model from merely functional to performant on Intel Gaudi. To achieve this, follow the steps outlined below and refer to the optimization guide documentation.
To get started, follow the Performance Optimization Guide Checklist and review the summary below.
Model optimization falls into three main categories:
- Initial model porting – Ensures the model is functional on Intel Gaudi processors, typically by running GPU Migration. Follow the Getting Started for Training or Inference.
- Model optimizations – General performance enhancements that apply to most models, such as managing Dynamic Shapes and using HPU Graphs for Training or Inference; see the sketch after this list.
- Profiling – Identifies bottlenecks on the host CPU or the Intel Gaudi device. It is recommended to follow the steps in the table below, starting with TensorBoard and the Intel Gaudi plug-in to identify specific issues.
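For illustration, the sketch below wraps a model in an HPU Graph for inference. It assumes the `habana_frameworks.torch` bridge and its `hpu.wrap_in_hpu_graph` helper; the model and tensor shapes are hypothetical, so refer to the HPU Graphs documentation for the exact API and supported usage.

```python
import torch
import habana_frameworks.torch as ht  # Intel Gaudi PyTorch bridge

# Hypothetical model used only for illustration.
model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
).to("hpu").eval()

# Capture the forward pass as an HPU Graph so that repeated inference calls
# replay the recorded graph instead of re-launching each op from the host.
model = ht.hpu.wrap_in_hpu_graph(model)

with torch.no_grad():
    logits = model(torch.randn(32, 128, device="hpu"))
    print(logits.cpu().shape)  # bring the result back to the host
```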
Activity | Details | Link |
---|---|---|
First step: PyTorch profiling with TensorBoard | Obtains Gaudi-specific performance recommendations in TensorBoard. | Profiling with PyTorch |
Review the PT_HPU_METRICS_FILE | Checks for excessive recompilations during runtime. | Setting HPU Metrics Review |
Intel Gaudi Profiling Trace Viewer | Uses the Intel Gaudi-specific Perfetto trace viewer for in-depth analysis of CPU and Intel Gaudi activity. | Getting Started with Perfetto Profiler |
Model logging | Sets ENABLE_CONSOLE to enable logging for debug and analysis. | Runtime Environment Variables |
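As a starting point for the first step in the table, the sketch below profiles a few training iterations with the PyTorch profiler and writes a TensorBoard trace. It assumes `ProfilerActivity.HPU` and `htcore.mark_step()` from the Intel Gaudi PyTorch bridge; the model, batch data, and log directory are hypothetical placeholders.

```python
import torch
import habana_frameworks.torch.core as htcore  # Intel Gaudi PyTorch bridge

device = torch.device("hpu")
model = torch.nn.Linear(512, 512).to(device)          # hypothetical model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
batches = [torch.randn(64, 512) for _ in range(8)]    # hypothetical input batches

with torch.profiler.profile(
    # ProfilerActivity.HPU is exposed by the Intel Gaudi software stack;
    # verify it is available in your installed PyTorch/Gaudi versions.
    activities=[torch.profiler.ProfilerActivity.CPU,
                torch.profiler.ProfilerActivity.HPU],
    schedule=torch.profiler.schedule(wait=1, warmup=2, active=4),
    on_trace_ready=torch.profiler.tensorboard_trace_handler("./profile_logs"),
    record_shapes=True,
) as prof:
    for batch in batches:
        optimizer.zero_grad()
        loss = model(batch.to(device)).sum()
        loss.backward()
        htcore.mark_step()   # flush the accumulated lazy-mode graph
        optimizer.step()
        htcore.mark_step()
        prof.step()          # advance the profiler schedule each iteration
```

Point TensorBoard at the trace directory (for example, `tensorboard --logdir ./profile_logs`) to review the Gaudi-specific recommendations.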
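The metrics file and console logging rows above are controlled through runtime environment variables. The following is a minimal sketch assuming the variables are read when the Gaudi PyTorch bridge initializes; exporting them in the shell before launching the script is the more common route, and the file path and log level shown are placeholders.

```python
import os

# Set these before the Intel Gaudi PyTorch bridge is initialized.
os.environ["PT_HPU_METRICS_FILE"] = "/tmp/hpu_metrics.json"  # hypothetical path for the runtime metrics dump
os.environ["ENABLE_CONSOLE"] = "true"                        # route runtime logs to the console
os.environ["LOG_LEVEL_ALL"] = "3"                            # assumption: numeric log level; see Runtime Environment Variables for the scale

import habana_frameworks.torch.core as htcore  # noqa: E402 - import after env setup
```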
Tutorial
- Profiling and Optimization

Videos
- Maximizing Model Performance