These performance numbers were measured with the latest SynapseAI software release, version 1.14.0-493, unless otherwise noted.
All models, for both training and inference, use the PyTorch 2.1.1 framework; other applicable frameworks are noted for each model.
These numbers were generated with the latest version of SynapseAI and improve on the officially submitted results posted on the MLCommons website.
| Model | # HPU | Precision | Time To Train | Framework Version |
|---|---|---|---|---|
| MLPerf 3.1 - GPT3 | 384 | fp8 | 153.58 min* | |
| MLPerf 3.1 - GPT3 | 256 | fp8 | 223.75 min** | |
| MLPerf 3.1 - Stable Diffusion v2 | 64 | bf16 | 19.4 min** | PyTorch Lightning 2.1.2 |
| MLPerf 3.1 - ResNet | 8 | bf16 | 16.22 min | |
| MLPerf 3.1 - BERT | 8 | bf16 | 14.25 min | |
* The GPT3 measurement with 384 cards was taken using a pre-launch version of the SynapseAI 1.13.0 software stack.
** The GPT3 measurement with 256 cards and the Stable Diffusion measurement were taken using the SynapseAI 1.13.0 software stack.
Gaudi2 Large Language Models Training Performance
| Model | # HPU | Precision | Throughput | Sequence Length | TP, PP, DP | Batch Size | Framework Version |
|---|---|---|---|---|---|---|---|
| LLaMA 13B | 64 | bf16 | 68.12 samples/sec | 2,048 | 2, 2, 16 | 256 | DeepSpeed 0.12.4 |
| LLaMA 2 70B | 256 | bf16 | 30 samples/sec | 4,096 | 8, 8, 4 | 1,024 | DeepSpeed 0.12.4 |
| LLaMA 2 70B | 512 | bf16 | 55.4 samples/sec | 4,096 | 8, 8, 8 | 2,048 | DeepSpeed 0.12.4 |
| LLaMA 2 70B | 1,024 | bf16 | 104.4 samples/sec | 4,096 | 8, 8, 16 | 4,096 | DeepSpeed 0.12.4 |
| Bloom-13B | 64 | bf16 | 72.5 samples/sec | 2,048 | 2, 2, 16 | 1,024 | DeepSpeed 0.12.4 |
TP, PP, DP: the Tensor Parallel, Pipeline Parallel, and Data Parallel degrees used for the Megatron-DeepSpeed training runs.
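The three parallelism degrees multiply out to the HPU count in each row. A minimal sketch of that relationship (my illustration, not part of the Megatron-DeepSpeed API):

```python
def world_size(tp: int, pp: int, dp: int) -> int:
    """Total device count implied by a tensor/pipeline/data-parallel split.

    Every combination of TP rank, PP stage, and DP replica maps to one device,
    so the world size is the product of the three degrees.
    """
    return tp * pp * dp

# Rows above: LLaMA 13B uses (2, 2, 16) on 64 HPUs; LLaMA 2 70B keeps
# TP = PP = 8 and scales DP from 4 to 16, giving 256 to 1,024 HPUs.
print(world_size(2, 2, 16))   # 64
print(world_size(8, 8, 16))   # 1024
```

Note that in these rows scaling out is done purely with the data-parallel degree, which also explains why the global batch size grows proportionally with the HPU count.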
Gaudi2 Reference Models Training Performance
| Model | # HPU | Precision | Throughput | Accuracy | Time To Train | Batch Size | Framework Version |
|---|---|---|---|---|---|---|---|
| DeepSpeed Chat LLaMA 7B Step1 | 8 | bf16 | 870 sec/iter | ppl: 1.61 | | 8 | Megatron DeepSpeed 0.12.4 |
| DeepSpeed Chat LLaMA 7B Step2 | 8 | bf16 | 770 sec/iter | acc: 81 | | 4 | Megatron DeepSpeed 0.12.4 |
| DeepSpeed Chat LLaMA 7B Step3 | 8 | bf16 | 7.8 sec/iter | ema: 2.7 | | 4 | Megatron DeepSpeed 0.12.4 |
| Stable Diffusion | 64 | bf16 | 10658.26 img/sec | | | 32 | Lightning 2.1.2 |
| Stable Diffusion Fine Tuning | 1 | bf16 | 70 img/sec | | | 7 | Lightning 2.1.2 |
| Stable Diffusion Fine Tuning Textual Inversion | 1 | bf16 | 20.58 img/sec | | | 7 | Lightning 2.1.2 |
| ResNet50 LARS | 32 | bf16 | 185594.14 img/sec | 76.33 | 6.62 min | 256 | |
| ResNet50 LARS | 8 | bf16 | 47104.44 img/sec | 76.24 | 17.53 min | 256 | |
| ResNet50 LARS | 1 | bf16 | 6072.03 img/sec | | | 256 | |
| BERT Pre Training Phase 1 | 32 | bf16 | 32249.28 sent/sec | Loss: 1.497506499 | 273.25 min | 64 | |
| BERT Pre Training Phase 1 | 8 | bf16 | 9126.16 sent/sec | | | 64 | |
| BERT Pre Training Phase 1 | 1 | bf16 | 1151.95 sent/sec | | | 64 | |
| BERT Pre Training Phase 2 | 32 | bf16 | 10812.96 sent/sec | Loss: 1.332446098 | 91.21 min | 16 | |
| BERT Pre Training Phase 2 | 8 | bf16 | 2779.63 sent/sec | | | 16 | |
| BERT Pre Training Phase 2 | 1 | bf16 | 348.71 sent/sec | | | 16 | |
| BERT SQUAD Fine Tuning | 8 | bf16 | 2054.6 sent/sec | 90.83 | 5.12 min | 24 | |
| BERT SQUAD Fine Tuning | 1 | bf16 | 281.49 sent/sec | | | 24 | |
| ResNext101 | 8 | bf16 | 22146.5 img/sec | 78.03 | 102 min | 256 | |
| ResNext101 | 1 | bf16 | 2841.31 img/sec | | | 256 | |
| SSD | 8 | bf16 | 16555.1 img/sec | 22.95 | 9.58 min | 128 | |
| SSD | 1 | bf16 | 2098.21 img/sec | | | 128 | |
| Transformer | 8 | bf16 | 1074849 token/sec | 27.9 | 242.05 min | 8192 | |
| Transformer | 1 | bf16 | 133257.66 token/sec | | | 8192 | |
| Unet2D | 8 | bf16 | 16999.22 img/sec | 72.61 | 13.92 min | 64 | Lightning 2.1.2 |
| Unet2D | 1 | bf16 | 2697.3 img/sec | | | 64 | Lightning 2.1.2 |
| Unet3D | 8 | bf16 | 256.76 img/sec | 74.26 | 19.77 min | 2 | Lightning 2.1.2 |
| Unet3D | 1 | bf16 | 32.69 img/sec | | | 2 | Lightning 2.1.2 |
Hugging Face Optimum Habana Gaudi2 Training Performance
See the Examples page for information on how to run each of the Tasks, including model naming and hyperparameter usage.
| Model | # HPU | Precision | Throughput | Accuracy | Time To Train | Batch Size | Task | Framework Version |
|---|---|---|---|---|---|---|---|---|
| Llama2-70B Fine Tuning (LoRA) | 8 | bf16 | 2.43 sentences/sec | 2.12 | 43.21 min | 10 | language-modeling | Optimum Habana 1.9.0 |
| Llama1-7B Fine Tuning (LoRA) | 8 | bf16 | 143.02 sentences/sec | 0.93 | 5.5 min | 64 | language-modeling | Optimum Habana 1.9.0 |
| Falcon-180B Fine Tuning (LoRA) | 8 | bf16 | 1.53 sentences/sec | 1.05 | 270.13 min | 1 | language-modeling | Optimum Habana 1.9.0 |
| Falcon-40B Fine Tuning (LoRA) | 8 | bf16 | 28.83 sentences/sec | 1.19 | 16.8 min | 1 | language-modeling | Optimum Habana 1.9.0 |
| GPTJ-CLM | 8 | bf16 | 16.64 sentences/sec | 0.53 | 14.13 min | 4 | language-modeling | Optimum Habana 1.9.0 |
| GPTNEOX-20B-CLM | 8 | bf16 | 290.8 sentences/sec | 0.17 | 28.75 min | 2 | language-modeling | Optimum Habana 1.9.0 |
| BridgeTower | 8 | bf16 | 473.05 sentences/sec | | 7.4 min | 40 | contrastive-image-text | Optimum Habana 1.9.0 |
| GPT2 | 8 | bf16 | 575.41 sentences/sec | | | 4 | language-modeling | Optimum Habana 1.9.0 |
| GPT2-XL | 8 | bf16 | 86.59 sentences/sec | | | 4 | language-modeling | Optimum Habana 1.9.0 |
| ALBERT-Large | 8 | bf16 | 2280.45 sentences/sec | 91.84 | 2.08 min | 32 | question-answering | Optimum Habana 1.9.0 |
| ALBERT-XXL | 8 | bf16 | 446.96 sentences/sec | 94.89 | 7.01 min | 12 | question-answering | Optimum Habana 1.9.0 |
| BERT Base | 8 | bf16 | 3103 sentences/sec | 85.42 | 1.2 min | 24 | question-answering | Optimum Habana 1.9.0 |
| BERT-Large Fine Tuning | 8 | bf16 | 2256.81 sentences/sec | 93.29 | 1.91 min | 24 | question-answering | Optimum Habana 1.9.0 |
| ClipRoBERTa | 8 | bf16 | 1160.11 images/sec | | 27.86 min | 64 | contrastive-image-text | Optimum Habana 1.9.0 |
| DistilBERT | 8 | bf16 | 9985.41 sentences/sec | 82.5 | 0.56 min | 8 | question-answering | Optimum Habana 1.9.0 |
| Flan-T5 XXL | 8 | bf16 | 28.59 sentences/sec | 36.64 | 387.47 min | 22 | question-answering | Optimum Habana 1.9.0 |
| RoBERTa Base | 8 | bf16 | 6563.31 sentences/sec | 92.15 | 0.75 min | 12 | question-answering | Optimum Habana 1.9.0 |
| RoBERTa Large | 8 | bf16 | 2234.79 sentences/sec | 94.52 | 1.93 min | 12 | question-answering | Optimum Habana 1.9.0 |
| Swin Transformer | 8 | bf16 | 5753.05 images/sec | | | 64 | image-classification | Optimum Habana 1.9.0 |
| T5-LARGE | 8 | bf16 | 84.77 sentences/sec | 44.28 | 226.83 min | 4 | summarization | Optimum Habana 1.9.0 |
| T5-Small | 8 | bf16 | 576.91 sentences/sec | 26.09 | 105.8 min | 4 | translation | Optimum Habana 1.9.0 |
| Vision Transformer | 8 | bf16 | 6004.6 images/sec | 98.91 | | 128 | image-classification | Optimum Habana 1.9.0 |
| Wav2Vec2.0 AC | 8 | bf16 | 1997.84 sentences/sec | 81.32 | 2.36 min | 16 | speech-recognition | Optimum Habana 1.9.0 |
| Wav2Vec2.0 ASR | 8 | bf16 | 56.18 sentences/sec | 3.88 | 25.4 min | 4 | speech-recognition | Optimum Habana 1.9.0 |
MosaicML Gaudi2 Training Performance
| Framework Version | Model | # HPU | Precision | Throughput | Accuracy | Time To Train | Batch Size |
|---|---|---|---|---|---|---|---|
| PyTorch 2.1.1 | MosaicML MPT-1B | 8 | bf16 | 24478 samples/sec | 6.99 | 16.6 min | 512 |
| PyTorch 2.1.1 | MosaicML MPT-70B | 32 | bf16 | 13940 samples/sec | 7.49 | 138.4 min | 512 |
Gaudi Reference Models Training Performance
| Model | # HPU | Precision | Throughput | Accuracy | Time To Train | Batch Size | Framework Version |
|---|---|---|---|---|---|---|---|
| ResNet50 Keras LARS | 32 | bf16 | 48176.47 img/sec | 75.86 | 19.66 min | 256 | |
| ResNet50 Keras LARS | 8 | bf16 | 12305.57 img/sec | 76.16 | 69.86 min | 256 | |
| ResNet50 Keras LARS | 1 | bf16 | 1624.15 img/sec | | | 256 | |
| BERT Pre Training combine | 32 | bf16 | 4805.33 sent/sec | | 1803.26 min | 64 | |
| BERT Pre Training combine | 8 | bf16 | 1221.73 sent/sec | | | 64 | |
| BERT Pre Training combine | 1 | bf16 | 153.34 sent/sec | | | 64 | |
| BERT Pre Training Phase 1 | 32 | bf16 | 5757.69 sent/sec | Loss: 1.49 | 1348.41 min | 64 | |
| BERT Pre Training Phase 1 | 8 | bf16 | 1467.43 sent/sec | | | 64 | |
| BERT Pre Training Phase 1 | 1 | bf16 | 184.18 sent/sec | | | 64 | |
| BERT Pre Training Phase 2 | 32 | bf16 | 1911.84 sent/sec | Loss: 1.33 | 454.85 min | 8 | |
| BERT Pre Training Phase 2 | 8 | bf16 | 482.51 sent/sec | | | 8 | |
| BERT Pre Training Phase 2 | 1 | bf16 | 60.56 sent/sec | | | 8 | |
| BERT SQUAD Fine Tuning | 8 | bf16 | 404.66 sent/sec | 90.68 | 13.08 min | 24 | |
| BERT SQUAD Fine Tuning | 1 | bf16 | 53 sent/sec | | | 24 | |
| BART Fine Tuning | 8 | bf16 | 1763.9 sent/sec | | | 32 | |
| DINO | 8 | bf16 | 937.41 exmpl/sec | 77 | 2280.8 min | 64 | |
| MobileNetV2 | 8 | bf16 | 12049 img/sec | 71.21 | 531.06 min | 256 | |
| ResNet152 | 8 | bf16 | 4985.28 img/sec | 78.56 | 435.41 min | 128 | |
| SSD** | 8 | bf16 | 3557.6 images/sec | | | 128 | |
| Transformer | 8 | bf16 | 186126.33 tokens/sec | 28.2 | 1035.21 min | 4096 | |
| Unet2D | 8 | bf16 | 5133.72 img/sec | 72.6 | 58.9 min | 64 | Lightning 2.1.2 |
| Unet3D | 8 | bf16 | 61.53 img/sec | 74.21 | 76.5 min | 2 | Lightning 2.1.2 |
| YOLOX | 8 | bf16 | 380.08 img/sec | 39.65 | 2104.73 min | 16 | |
| ResNet50 Host NIC (libfabric) | 16 | bf16 | 22542.08 img/sec | | | 256 | |
Hugging Face Optimum Habana Gaudi Training Performance
See the Examples page for information on how to run each of the Tasks, including model naming and hyperparameter usage.
| Model | # HPU | Precision | Throughput | Accuracy | Time To Train | Batch Size | Task | Framework Version |
|---|---|---|---|---|---|---|---|---|
| GPT2-XL | 8 | bf16 | 18.69 sentences/sec | 0.47 | 77.1 min | 4 | language-modeling | DeepSpeed 0.12.4, Optimum Habana 1.9.0 |
| GPT2 | 8 | bf16 | 166.18 sentences/sec | 0.41 | 4 min | 4 | language-modeling | DeepSpeed 0.12.4, Optimum Habana 1.9.0 |
| T5-LARGE | 8 | bf16 | 40.14 sentences/sec | 44.24 | 434.71 min | 4 | summarization | DeepSpeed 0.12.4, Optimum Habana 1.9.0 |
| T5-Small | 8 | bf16 | 177.98 sentences/sec | 26.1 | 182.06 min | 4 | translation | DeepSpeed 0.12.4, Optimum Habana 1.9.0 |
| ALBERT-L | 8 | bf16 | 447.01 sentences/sec | 92.81 | 8.43 min | 32 | question-answering | Optimum Habana 1.9.0 |
| ALBERT-XXL | 8 | bf16 | 75.65 sentences/sec | 94.93 | 41.5 min | 12 | question-answering | Optimum Habana 1.9.0 |
| BERT-BASE | 8 | bf16 | 1200.38 sentences/sec | 85.28 | 2.93 min | 24 | question-answering | Optimum Habana 1.9.0 |
| BERT-Large FT | 8 | bf16 | 395.41 sentences/sec | 93.18 | 8.91 min | 24 | question-answering | Optimum Habana 1.9.0 |
| Clip-RoBERTa | 8 | bf16 | 916.37 images/sec | | | 64 | contrastive-image-text | Optimum Habana 1.9.0 |
| DistilBERT | 8 | bf16 | 1559.41 sentences/sec | 85.37 | 2.95 min | 8 | question-answering | Optimum Habana 1.9.0 |
| RoBERTa Base | 8 | bf16 | 1063.78 sentences/sec | 91.94 | 3.13 min | 12 | question-answering | Optimum Habana 1.9.0 |
| RoBERTa Large | 8 | bf16 | 361.06 sentences/sec | 94.46 | 9.18 min | 12 | question-answering | Optimum Habana 1.9.0 |
| Swin Transformer | 8 | bf16 | 1565.2 images/sec | 98.61 | | 64 | image-classification | Optimum Habana 1.9.0 |
| Vision Transformer | 8 | bf16 | 2338.78 images/sec | 97.39 | | 64 | image-classification | Optimum Habana 1.9.0 |
| Wav2Vec2-AC | 8 | bf16 | 646.17 sentences/sec | 80.52 | 6.43 min | 16 | speech-recognition | Optimum Habana 1.9.0 |
| Wav2Vec2-ASR | 8 | bf16 | 34.19 sentences/sec | 3.86 | 42.58 min | 4 | speech-recognition | Optimum Habana 1.9.0 |
Gaudi2 MLPerf™ 3.1 Inference Performance
| Framework Version | Model | # HPU | Precision | Performance |
|---|---|---|---|---|
| PyTorch 2.1.1 | MLPerf 3.1 - GPT-J Offline 99.9% Accuracy | 8 | fp8 | 83.4 samples/sec |
| PyTorch 2.1.1 | MLPerf 3.1 - GPT-J Server 99.9% Accuracy | 8 | fp8 | 77.66 queries/sec |
Gaudi2 Large Language Models Inference Performance
| Model | # HPU | Precision | Input Length | Output Length | Max Token Sequence Length | Throughput | Latency*** | Batch | Framework Version |
|---|---|---|---|---|---|---|---|---|---|
| Falcon-7B | 1 | bf16 | 100 | 8k | 8k | 110.7 token/sec | 9.03 ms | 1 | Optimum Habana 1.9.0 |
| Bloom-7B-Greedy | 1 | bf16 | | | 2k | 721.56 token/sec | 11.08 ms | 8 | |
| Bloom-7B-Greedy | 1 | fp8 | | | 2K | 194.12 token/sec | 5.15 ms | 1 | |
| GPT-J | 8 | bf16 | 6 | 100 | 100 | 562.23 token/sec | 7.11 ms | 4 | DeepSpeed 0.12.4, Optimum Habana 1.9.0 |
| LLaMA 2-7B | 1 | fp8 | 1K | 3k | 4k | 1101.7 token/sec | 10.89 ms | 12 | Optimum Habana 1.9.0 |
| LLaMA 2-7B | 1 | fp8 | 2k | 6k | 8k | 551.84 token/sec | 10.87 ms | 6 | Optimum Habana 1.9.0 |
| LLaMA 2-7B | 1 | fp8 | 4k | 12k | 16k | 273.32 token/sec | 10.97 ms | 3 | Optimum Habana 1.9.0 |
| LLaMA 2-7B | 1 | bf16 | 1k | 3k | 4k | 361.14 token/sec | 11.07 ms | 4 | Optimum Habana 1.9.0 |
| Falcon-40B | 8 | bf16 | 100 | 8k | 8k | 61.85 token/sec | 16.16 ms | 1 | Optimum Habana 1.9.0 |
| LLaMA 2-70B | 8 | fp8 | 2k | 2k | 4k | 4910.4 token/sec | 56.41 ms | 277 | DeepSpeed 0.12.4, Optimum Habana 1.9.0 |
| LLaMA 2-70B | 8 | fp8 | 2k | 6k | 8k | 2859.5 token/sec | 26.92 ms | 77 | DeepSpeed 0.12.4, Optimum Habana 1.9.0 |
| LLaMA 2-70B | 8 | fp8 | 2k | 14k | 16k | 1470.6 token/sec | 25.83 ms | 38 | DeepSpeed 0.12.4, Optimum Habana 1.9.0 |
| LLaMA 2-70B | 8 | fp8 | 2k | 30k | 32k | 775.4 token/sec | 24.5 ms | 19 | DeepSpeed 0.12.4, Optimum Habana 1.9.0 |
| LLaMA 2-70B | 8 | bf16 | 2k | 2k | 4k | 3225.6 token/sec | 66.96 ms | 216 | DeepSpeed 0.12.4, Optimum Habana 1.9.0 |
| LLaMA 2-70B | 8 | bf16 | 2k | 6k | 8k | 1229.1 token/sec | 24.4 ms | 30 | DeepSpeed 0.12.4, Optimum Habana 1.9.0 |
| LLaMA 2-70B | 8 | bf16 | 2k | 14k | 16k | 566.9 token/sec | 26.45 ms | 15 | DeepSpeed 0.12.4, Optimum Habana 1.9.0 |
| LLaMA 2-13B | 1 | bf16 | 2k | 2k | 4k | 125.66 token/sec | 15.91 ms | 2 | Optimum Habana 1.9.0 |
| Bloomz-176B | 8 | bf16 | 6 | 100 | 100 | 36.36 token/sec | 27.5 ms | 1 | DeepSpeed 0.12.4, Optimum Habana 1.9.0 |
| Bloom-176B-Greedy | 8 | fp8 | | | 4K | 199.39 token/sec | 40.12 ms | 8 | DeepSpeed 0.12.4 |
| Bloom-176B-Greedy | 8 | bf16 | | | 4K | 394.47 token/sec | 53.23 ms | 21 | DeepSpeed 0.12.4 |
| Bloom-176B-Greedy | 8 | bf16 | | | 8K | 196.21 token/sec | 50.96 ms | 10 | DeepSpeed 0.12.4 |
| Bloom-176B-Greedy | 8 | bf16 | | | 16K | 80.79 token/sec | 49.51 ms | 4 | DeepSpeed 0.12.4 |
| Bloom-176B-Greedy | 8 | bf16 | | | 32K | 25.81 token/sec | 38.74 ms | 1 | DeepSpeed 0.12.4 |
| Bloom-176B-Sampling | 8 | bf16 | | | 1k | 19.59 token/sec | 51.04 ms | 1 | DeepSpeed 0.12.4 |
| Bloom-176B-BeamSearch-8 | 8 | bf16 | | | 512 | 30.57 token/sec | 32.7 ms | 1 | DeepSpeed 0.12.4 |
Gaudi2 Reference Models Inference Performance
| Model | # HPU | Precision | Throughput | Latency*** | Batch | Framework Version |
|---|---|---|---|---|---|---|
| Stable Diffusion v2.1 (512x512) | 1 | bf16 | 1.2 img/sec | 830.56 ms | 1 | Lightning 2.1.2 |
| Stable Diffusion v2.1 (768x768) | 1 | bf16 | 0.45 img/sec | 2217.29 ms | 1 | Lightning 2.1.2 |
| Bert FT | 1 | bf16 | 807.84 token/sec | 29.7 ms | 24 | |
| Resnet50 | 1 | bf16 | 16897.38 img/sec | 15.15 ms | 256 | |
| Resnext101 | 1 | bf16 | 10355.48 img/sec | 24.72 ms | 256 | |
| Unet2D | 1 | bf16 | 8386.18 img/sec | 7.63 ms | 64 | Lightning 2.1.2 |
| Unet3D | 1 | bf16 | 112.85 img/sec | 17.72 ms | 2 | Lightning 2.1.0 |
Hugging Face Optimum Habana Gaudi2 Inference Performance
See the Examples page for information on how to run each of the Tasks, including model naming and hyperparameter usage.
| Model | # HPU | Precision | Max Token Sequence Length | Throughput | Latency | Batch | Task | Framework Version |
|---|---|---|---|---|---|---|---|---|
| StableDiffusion v2.1 (512x512) | 1 | bf16 | | 1.24 images/sec | 3223.2 ms | 4 | stable-diffusion | PyTorch Lightning 2.1.2 |
| OPT | 1 | bf16 | | 980.99 token/sec | 1.01 ms | 1 | text-generation | DeepSpeed 0.12.4, Optimum Habana 1.9.0 |
| StarCoder | 1 | bf16 | | 65.5 token/sec | 15.26 ms | 1 | text-generation | DeepSpeed 0.12.4, Optimum Habana 1.9.0 |
| MPT-7B | 1 | bf16 | 1932 | 105.38 token/sec | 9.48 ms | 1 | text-generation | Optimum Habana 1.9.0 |
| Bert (Text Classification) | 1 | bf16 | | 186.26 token/sec | 42.94 ms | 8 | text-classification | Optimum Habana 1.9.0 |
| Bert (Language Modeling) | 1 | bf16 | | 80.72 token/sec | 49.55 ms | 4 | language-modeling | Optimum Habana 1.9.0 |
| Bert (Question Answering) | 1 | bf16 | | 599.54 token/sec | 13.34 ms | 8 | question-answering | Optimum Habana 1.9.0 |
| Bart | 1 | bf16 | | 6.5 token/sec | 614.91 ms | 4 | language-modeling | Optimum Habana 1.9.0 |
| BridgeTower | 1 | bf16 | | 329.65 token/sec | 48.53 ms | 16 | contrastive-image-text | Optimum Habana 1.9.0 |
| ESMFold | 1 | bf16 | | 3.67 token/sec | 272.47 ms | 1 | protein-folding | Optimum Habana 1.9.0 |
| StableLM-3B | 1 | bf16 | 2048 | 232.34 token/sec | 4.3 ms | 1 | text-generation | Optimum Habana 1.9.0 |
| StableLM-7B | 1 | bf16 | 2048 | 123.02 token/sec | 8.12 ms | 1 | text-generation | Optimum Habana 1.9.0 |
| T5-3B Summarization 1024-128 Beam4 | 1 | bf16 | | 0.96 token/sec | 1035.19 ms | 1 | summarization | Optimum Habana 1.9.0 |
| T5-3B Summarization Greedy | 1 | bf16 | | 2.37 token/sec | 420.87 ms | 1 | summarization | Optimum Habana 1.9.0 |
| HF-T5-Small-Translation-Greedy | 1 | bf16 | | 1031.6 token/sec | 3.87 ms | 4 | translation | Optimum Habana 1.9.0 |
| Wav2vec (Audio Classification) | 1 | bf16 | | 15.26 token/sec | 262.12 ms | 4 | audio-classification | Optimum Habana 1.9.0 |
| Wav2vec (Speech Recognition) | 1 | bf16 | | 23.14 token/sec | 172.83 ms | 4 | speech-recognition | Optimum Habana 1.9.0 |
Gaudi Reference Models Inference Performance
| Model | # HPU | Precision | Throughput | Latency*** | Batch Size | Framework Version |
|---|---|---|---|---|---|---|
| Bloom-176B-BeamSearch-8 | 16 | bf16 | 10.51 token/sec | 95.1 ms | 1 | DeepSpeed 0.12.4 |
| Bloom-176B-Greedy | 16 | bf16 | 11.92 token/sec | 83.38 ms | 1 | DeepSpeed 0.12.4 |
| Bloom-176B-Sampling | 16 | bf16 | 7.98 token/sec | 125.26 ms | 1 | DeepSpeed 0.12.4 |
| Bloom-7B (512 token) | 1 | bf16 | 42.84 token/sec | 23.34 ms | 1 | |
| Stable Diffusion v2.1 (512x512) | 1 | bf16 | 0.36 img/sec | 2777.77 ms | 1 | Lightning 2.1.2 |
| Stable Diffusion v2.1 (768x768) | 1 | bf16 | 0.13 img/sec | 7692.3 ms | 1 | Lightning 2.1.2 |
| Bert | 1 | bf16 | 147.17 token/sec | 163.12 ms | 24 | |
| Unet2D | 1 | bf16 | 1364.2 img/sec | 46.9 ms | 64 | Lightning 2.1.2 |
| Unet3D | 1 | bf16 | 52.68 img/sec | 37.96 ms | 2 | Lightning 2.1.2 |
Hugging Face Optimum Habana Gaudi Inference Performance
See the Examples page for information on how to run each of the Tasks, including model naming and hyperparameter usage.
| Model | # HPU | Precision | Throughput | Latency | Batch | Task | Framework Version |
|---|---|---|---|---|---|---|---|
| BERT | 1 | bf16 | 39.55 token/sec | 101.12 ms | 4 | language-modeling | Optimum Habana 1.9.0 |
| BERT | 1 | bf16 | 126.85 token/sec | 63.06 ms | 8 | question-answering | Optimum Habana 1.9.0 |
| BERT | 1 | bf16 | 107.76 token/sec | 74.23 ms | 8 | text-classification | Optimum Habana 1.9.0 |
| BART-Greedy | 1 | bf16 | 2.96 token/sec | 675.67 ms | 2 | summarization | Optimum Habana 1.9.0 |
| ESMFold | 1 | bf16 | 14.17 token/sec | 70.54 ms | 1 | protein-folding | Optimum Habana 1.9.0 |
| Stable Diffusion v2.1 (512x512) | 1 | bf16 | 0.35 img/sec | 11173.18 ms | 4 | text-to-image generation | Optimum Habana 1.9.0 |
| T5-Small Translation Greedy | 1 | bf16 | 15.48 token/sec | 258.36 ms | 4 | translation | Optimum Habana 1.9.0 |
| Wav2Vec 2.0 ASR | 1 | bf16 | 529.7 token/sec | 7.68 ms | 4 | speech-recognition | Optimum Habana 1.9.0 |
| Wav2Vec 2.0 Speech Classification | 1 | bf16 | 9.39 token/sec | 425.62 ms | 4 | speech-recognition | Optimum Habana 1.9.0 |
*** For the large language models, the reported latency is the average next-token latency.
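For the batched decoding rows, throughput and this per-step latency are two views of the same measurement: each decoding step emits one token per sequence in the batch, so the average next-token latency is roughly batch size divided by throughput. A quick consistency check against the LLaMA 2-70B rows above (my illustration, not a measurement tool):

```python
def next_token_latency_ms(batch_size: int, tokens_per_sec: float) -> float:
    """Average per-step decode latency implied by batch size and throughput."""
    return batch_size / tokens_per_sec * 1000.0

# LLaMA 2-70B, fp8, 2k input / 2k output: batch 277 at 4910.4 token/sec
print(round(next_token_latency_ms(277, 4910.4), 2))   # 56.41 (ms), as tabulated
# LLaMA 2-70B, bf16, 2k input / 2k output: batch 216 at 3225.6 token/sec
print(round(next_token_latency_ms(216, 3225.6), 2))   # 66.96 (ms)
```

This also explains why the large-batch rows show higher latency despite much higher aggregate throughput.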
System Configuration:
Gaudi® Platform: HLS-1 system with eight Habana Gaudi HL-205 Mezzanine cards, two Intel® Xeon® Platinum 8280 CPUs @ 2.70GHz, and 756GB of system memory
Gaudi®2 Platform: HLS-Gaudi2 system with eight Habana Gaudi2 HL-225H Mezzanine cards, two Intel® Xeon® Platinum 8380 CPUs @ 2.30GHz, and 1TB of system memory
Amazon EC2 DL1 Instance: custom server with eight Habana Gaudi HL-205 Mezzanine cards, two Intel® Xeon® Platinum 8275CL CPUs @ 3.00GHz, and 756GB of system memory
Common Software: Ubuntu 22.04, SynapseAI software version 1.14.0-493
PyTorch: models run with PyTorch v2.1.1 use this Docker image
Environment: these workloads are run using Docker images directly on the host OS
Performance varies by use, configuration and other factors. Please refer to the Model-References GitHub page for each model’s support and validation coverage. All information provided here is subject to change without notice. Habana Labs may make changes to its test conditions and internal reliability goals at any time. Contact your Habana Labs representative to obtain the latest Habana Labs product specifications and roadmaps. Your costs and results may vary.