Open date: 2023 Jul 4
Closing date: 2023 Aug 17
Collective Knowledge Contributor award: Yes
Reproduce and automate the NeurIPS 2022 paper "A Fast Post-Training Pruning Framework for Transformers" using the CM automation language.
Convert the models to an ONNX format accepted by the MLPerf BERT inference benchmark.
Create multiple BERT variations in ONNX format, each with a different level of sparsity, for the MLPerf inference v3.1 submission.
Upload these models to the cTuning space at Hugging Face.
Run MLPerf inference v3.1 with all BERT variations on any platform and submit results to MLPerf inference v3.1.
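The "different levels of sparsity" step above can be pictured with a small sketch. It is not the paper's mask-search procedure and not part of CM: it only shows, in plain Python, how one set of per-head importance scores could yield several named variants by pruning the lowest-scoring heads. The function names, the `bert-sparsity-N` naming scheme, and the scores are all hypothetical.

```python
from typing import Dict, List


def head_mask_for_sparsity(scores: List[float], sparsity: float) -> List[int]:
    """Return a 0/1 mask that prunes the lowest-scoring fraction of heads.

    Hypothetical helper: plain score ranking stands in for the paper's
    mask search, which is more involved.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    n_prune = int(round(len(scores) * sparsity))
    # Rank head indices from least to most important.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(scores))]


def achieved_sparsity(mask: List[int]) -> float:
    """Fraction of heads removed by the mask."""
    return 1.0 - sum(mask) / len(mask)


def build_variants(scores: List[float], levels: List[float]) -> Dict[str, List[int]]:
    """Map a hypothetical variant name to its mask, one per sparsity level."""
    return {
        f"bert-sparsity-{int(round(level * 100))}": head_mask_for_sparsity(scores, level)
        for level in levels
    }


if __name__ == "__main__":
    # Made-up importance scores for eight attention heads.
    example_scores = [0.9, 0.1, 0.5, 0.3, 0.8, 0.2, 0.7, 0.4]
    for name, mask in build_variants(example_scores, [0.25, 0.5, 0.75]).items():
        print(name, mask, achieved_sparsity(mask))
```

Each resulting mask would then have to be applied to the real model and the model exported to ONNX before it could be benchmarked; those steps depend on the framework and are left out here.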
Read this documentation to run reference implementations of MLPerf inference benchmarks using the CM automation language and use them as a base for your developments.
Check this ACM REP'23 keynote to learn more about our open-source project and long-term vision.