MLPerf_Compatibility_Table.adoc


N/A: Benchmark not present in a round

X: Benchmark changed in this round. Submission results can be compared across rounds only where the benchmark did not change between them
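The legend above can be read as a simple rule. The following is a minimal sketch of that rule, assuming a hypothetical per-round encoding in which every round carries exactly one marker: `"X"` (changed in that round), `"N/A"` (not present), or an empty string (present and unchanged, an encoding that is an assumption of this sketch). The function name `comparable` and the example row are made up for illustration and are not taken from the tables below.

```python
from typing import Sequence

ABSENT = "N/A"   # benchmark not present in the round
CHANGED = "X"    # benchmark changed in the round
UNCHANGED = ""   # assumed encoding: present and unchanged since the previous round


def comparable(rounds: Sequence[str], markers: Sequence[str], a: str, b: str) -> bool:
    """Return True if results from rounds `a` and `b` may be compared."""
    i, j = sorted((rounds.index(a), rounds.index(b)))
    # Conservative assumption: the benchmark must be present in both rounds
    # and in every round between them.
    if any(m == ABSENT for m in markers[i:j + 1]):
        return False
    # A change in any round after the earlier one breaks comparability.
    return all(m != CHANGED for m in markers[i + 1:j + 1])


# Hypothetical row, NOT real MLPerf data.
example_rounds = ["r1", "r2", "r3", "r4", "r5"]
example_markers = ["X", "", "X", "", "N/A"]

print(comparable(example_rounds, example_markers, "r1", "r2"))  # no change between r1 and r2
print(comparable(example_rounds, example_markers, "r2", "r3"))  # benchmark changed in r3
```

Under this encoding a change marker starts a new group of comparable rounds, and an absent round ends the current group.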

Training

Model

0.5

0.6

0.7

1.0

1.1

2.0

2.1

3.0

3.1

4.0

4.1

ResNet-50 v1.5

X

X

N/A

SSD-ResNet34

X

X

N/A

RetinaNet-ResNeXt50

N/A

X

MaskRCNN

X

X

N/A

NCF

X

N/A

NMT

X

X

N/A

Transformer

X

X

N/A

MiniGo

X

X

X

N/A

DLRM

N/A

X

N/A

DLRM-dcnv2

N/A

X

BERT

N/A

X

RNN-T

N/A

X

X

N/A

3D U-Net

N/A

X

N/A

GPT3

N/A

X

Stable Diffusion v2

N/A

X

LLama70B-LoRA

N/A

X

RGAT

N/A

X

Metric: Time-to-train (measured in minutes)

Note: In v0.6, ResNet-50 v1.5, SSD-ResNet34, and NMT increased their accuracy targets, and all v0.6 benchmarks changed initialization timing. In v0.7, MiniGo moved to a 19x19 board

HPC

Model

0.7

1.0

2.0

CosmoFlow

X

X

X

DeepCAM

X

X

Open Catalyst

N/A

X

X

Metrics: Time-to-train (measured in minutes) and weak-scaling throughput (measured in models/minute)

Inference

Model

0.5

0.7

1.0

1.1

2.0

2.1

3.0

3.1

MobileNet-v1

X

N/A

ResNet-50 v1.5

X

SSD-MobileNets

X

SSD-ResNet34

X

N/A

RetinaNet-ResNeXt50

N/A

X

NMT

X

N/A

DLRM

N/A

X

N/A

DLRM-v2

N/A

X

BERT

N/A

X

RNN-T

N/A

X

3D U-Net

N/A

X

GPT-J

N/A

X

Metrics: Queries/second (server), samples/second (offline), latency in milliseconds (single stream), number of streams (multi-stream, v0.5-v1.1), latency in milliseconds (multi-stream, v2.0 and newer)

Additional power metrics: System power in watts (server and offline), system energy per stream in joules (single stream and multi-stream)

Note: Performance metrics for inference and power submissions are not comparable

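Note: Multi-stream results from v0.5-v1.1 are not comparable with results from v2.0 and newer, because the multi-stream metric changed from number of streams to latency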

Note: Inference over Network scenario introduced in v2.1
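The multi-stream notes above amount to a version check. The following is a minimal sketch of that check; the helper names `parse_version`, `multistream_metric`, and `same_multistream_metric`, and the metric labels `"streams"` and `"latency_ms"`, are hypothetical and chosen only for illustration.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a round label such as 'v1.1' or '2.0' into a comparable tuple."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def multistream_metric(version: str) -> str:
    """Multi-stream metric for a round: stream count before v2.0, latency from v2.0 on."""
    return "streams" if parse_version(version) < (2, 0) else "latency_ms"


def same_multistream_metric(a: str, b: str) -> bool:
    """True if two rounds report multi-stream results with the same metric."""
    return multistream_metric(a) == multistream_metric(b)


print(multistream_metric("v1.1"))              # streams
print(multistream_metric("v2.0"))              # latency_ms
print(same_multistream_metric("v1.1", "v2.0"))  # False
```

Sharing a metric is a necessary condition only. Two rounds that use the same metric are still not comparable if the table marks a benchmark change between them.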

Mobile

Model

0.7

1.0

1.1

2.0

2.1

3.0

MobileNetEdge

X

SSD-MobileNetsV2

X

N/A

MobileDET

N/A

X

DeeplabV3

X

N/A

MOSAIC

N/A

X

MobileBERT

X

EDSR

N/A

X

Primary metrics: Latency in milliseconds (single stream), samples/second (offline)

Note: A submission requires all benchmarks in the single-stream scenario, plus MobileNetEdge in the offline scenario

Tiny

Model

0.5

0.7

1.0

MobileNetV1

X

X

ResNet-V1

X*

X

DSCNN

X

X

FC Autoencoder

X

X

Primary metric: Latency (measured in milliseconds)

Secondary metric: Energy per inference (measured in microjoules)

*Latency is comparable, accuracy is not: v0.5 and v0.7 use the same model, but the evaluation set was changed to improve balance.