cTuning & MLCommons Collective Knowledge Challenges


Reproduce results from the NeurIPS 2022 paper for fast BERT pruning and integrate with the MLPerf inference benchmark

Open date: 2023 Jul 4

Closing date: 2023 Aug 17

Collective Knowledge Contributor award: Yes

Added CM interface

https://github.com/ctuning/cm4research/blob/main/script/reproduce-neurips-paper-2022-arxiv-2204.09656/README-extra.md

Challenge

Reproduce and automate the NeurIPS 2022 paper "A Fast Post-Training Pruning Framework for Transformers" using the CM automation language.

Convert the pruned models to an ONNX format accepted by the MLPerf BERT inference benchmark.

Create multiple BERT variations in ONNX format, each with a different level of sparsity, for the MLPerf inference v3.1 submission.

Upload the resulting models to the cTuning space at Hugging Face.

Run MLPerf inference v3.1 with all BERT variations on any platform and submit results to MLPerf inference v3.1.
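As a rough illustration of what "different levels of sparsity" can mean when labeling BERT variations, here is a minimal Python sketch. It assumes each pruned model is described by binary masks over attention heads and feed-forward filters, where 1 means kept and 0 means pruned. The function names, mask keys, and naming scheme are hypothetical and are not taken from the paper's code or from CM. The sketch reports the fraction of masked units, which is a simplification and not a FLOPs-based measure.

```python
from typing import Dict, List


def sparsity(masks: Dict[str, List[int]]) -> float:
    """Return the fraction of pruned (zero) entries across all binary masks."""
    total = sum(len(mask) for mask in masks.values())
    if total == 0:
        raise ValueError("no mask entries given")
    zeros = sum(1 for mask in masks.values() for value in mask if value == 0)
    return zeros / total


def variant_name(base: str, masks: Dict[str, List[int]]) -> str:
    """Build a hypothetical label such as 'bert-large-sparsity33'."""
    percent = round(sparsity(masks) * 100)
    return f"{base}-sparsity{percent:02d}"


# Toy masks for one layer: 4 attention heads and 8 feed-forward filters.
example_masks = {
    "layer0.heads": [1, 1, 0, 1],
    "layer0.ffn": [1, 0, 0, 1, 1, 1, 0, 1],
}

print(variant_name("bert-large", example_masks))
```

A label of this kind makes it easier to tell the uploaded ONNX variations apart when several of them are submitted side by side.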

Read this documentation to run reference implementations of MLPerf inference benchmarks using the CM automation language and use them as a base for your developments.

Check this ACM REP'23 keynote to learn more about our open-source project and long-term vision.

Prizes

All contributors will receive 1 point for submitting valid results for 1 complete benchmark on one system.
All contributors will receive an official MLCommons Collective Knowledge contributor award (see this example).

Organizers

Initial discussion and materials

ArXiv
Code
