cTuning & MLCommons Collective Knowledge Challenges

Since 2014, we have organized open benchmarking, optimization and reproducibility challenges in collaboration with ACM, MLCommons, IEEE, NeurIPS, HiPEAC and the community. Our goal is to connect industry, academia, students and the broader community to learn how to build and run AI, ML and other emerging workloads in a more efficient and cost-effective way (latency, throughput, accuracy, energy usage, size, cost, etc.) across diverse and rapidly evolving models, datasets, software and hardware, using a common automation framework with technology-agnostic automation recipes for MLOps and MLPerf. Learn more about our initiatives and long-term goals from our ArXiv white paper, ACM REP keynote, ACM TechTalk and our Artifact Evaluation website.
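
Most of these challenges are automated with the MLCommons Collective Mind (CM) framework mentioned above. As a rough, non-authoritative sketch of that common automation interface, the Python snippet below pulls the public mlcommons@cm4mlops recipe repository and runs a CM script selected by tags; it assumes the cmind package is installed, and the specific tags and options are illustrative and may vary between CM versions.

    # Minimal sketch of driving CM automation recipes from Python.
    # Assumes `pip install cmind` and network access to the public
    # mlcommons@cm4mlops repository; the script tags and options below
    # are illustrative and may differ between CM versions.
    import cmind

    # Pull the repository that hosts the MLOps/MLPerf automation recipes.
    r = cmind.access({'action': 'pull',
                      'automation': 'repo',
                      'artifact': 'mlcommons@cm4mlops'})
    if r['return'] > 0:
        raise RuntimeError(r.get('error', 'CM repo pull failed'))

    # Run a technology-agnostic CM script selected by tags
    # (here a simple OS-detection recipe used by many MLPerf workflows).
    r = cmind.access({'action': 'run',
                      'automation': 'script',
                      'tags': 'detect,os',
                      'quiet': True})
    if r['return'] > 0:
        raise RuntimeError(r.get('error', 'CM script run failed'))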

Completed challenges and events

1) Participate in the MLPerf inference v5.0 submission
2) Participate in the 3rd ACM Conference on Reproducibility and Replicability (ACM REP 2025)
3) Run and optimize the MLPerf inference benchmark using CM at the Student Cluster Competition'24 at SuperComputing'24
4) Reproduce results from the accepted ACM/IEEE MICRO'24 papers (artifact evaluation)
5) Add MLCommons CM workflows and a unified interface to automate MLPerf inference v3.1 and v4.0 benchmarks (Intel, Nvidia, Qualcomm, Arm64, TPU ...)
6) Organize the HiPEAC Reproducibility Student Challenge 2024 and automate it using the CM framework
7) Reproduce and optimize MLPerf inference v3.1 benchmarks at the Student Cluster Competition'23 at SuperComputing'23 using CM
8) Work with the community to find the most efficient CPUs (Intel/AMD/Arm) for BERT and MobileNets/EfficientNets (latency, throughput, accuracy, number of cores, frequency, memory size, cost and other metrics)
9) Work with the community to find the most efficient Nvidia GPUs for GPT-J 6B model and BERT (latency, throughput, accuracy, number of cores, frequency, memory size, cost, and other metrics)
10) Add the CM interface to run MLPerf inference benchmarks on Intel-based platforms
11) Add the CM interface to run MLPerf inference benchmarks on Qualcomm AI100-based platforms
12) Add more models and hardware backends to the universal C++ implementation of MLPerf inference benchmarks from MLCommons
13) Crowd-benchmark all MLPerf inference benchmarks similar to SETI@home (latency, throughput, power consumption, accuracy, costs)
14) Develop a reference implementation of any MLPerf inference benchmark to run on Amazon Inferentia and submit to MLPerf inference v3.1+
15) Develop a reference implementation of any MLPerf inference benchmark to run on the latest publicly available Google TPU (GCP or Coral USB accelerator) and submit to MLPerf inference v3.1+
16) Implement CM automation to benchmark Hugging Face models using MLPerf loadgen
17) Reproduce results from the NeurIPS 2022 paper for fast BERT pruning and integrate with the MLPerf inference benchmark
18) Run and optimize MLPerf inference v3.1 benchmarks on Windows
19) Run and optimize MLPerf inference v3.1 benchmarks with Apache TVM
20) Run and optimize MLPerf inference v3.1 benchmarks with Neural Magic's DeepSparse library
21) Run reference implementations of MLPerf inference benchmarks using the Mojo language from Modular.ai
22) Reproduce results from published ACM/IEEE MICRO'23 papers (artifact evaluation)
23) Reproduce MLPerf training v3.0 benchmarks
24) Reproduce and optimize TinyMLPerf inference v1.1 benchmarks
25) Reproduce results from the IPOL'22 journal paper using the CM automation framework (proof-of-concept)
26) Reproduce MLPerf inference v3.0 results for Nvidia Jetson Orin
27) Run and optimize MLPerf inference v3.0 benchmarks
28) Automate MLPerf RetinaNet benchmark at the Student Cluster Competition at SuperComputing'22 using CM
29) Run and optimize MLPerf inference v2.1 benchmarks
30) Reproduce results from published papers at ACM ASPLOS'22 (artifact evaluation)
31) Reproduce results from published ACM/IEEE MICRO'21 papers (artifact evaluation)
32) Reproduce results from published papers at ACM ASPLOS'21 (artifact evaluation)
33) Reproduce results from published papers at ACM ASPLOS'20 (1st artifact evaluation)
34) Reproduce results from published papers at MLSys'20 (artifact evaluation)
35) Organize reproducibility initiative at ACM/IEEE SuperComputing'19 with our unified artifact appendix and checklist
36) Reproduce benchmarking results at the Student Cluster Competition at ACM/IEEE SuperComputing'19 using Collective Knowledge
37) Reproduce results from published papers at SysML'19 (1st artifact evaluation)
38) Reproduce results from published papers at PPoPP'19 (artifact evaluation)
39) Organize FOSDEM'19 session about open reproducibility and optimization challenges
40) Organize Quantum Collective Knowledge hackathon at Ecole 42 in Paris
41) Organize reproducible quantum hackathons automated by Collective Knowledge
42) Organize the 1st Reproducible Quality-Efficient Systems Tournament (ACM REQUEST at ASPLOS'18)
43) Reproduce results from published papers at ACM REQUEST-ASPLOS'18 (artifact evaluation)
44) Reproduce results from published papers at ACM/IEEE CGO'18 (artifact evaluation)
45) Reproduce results from published papers at PPoPP'18 (artifact evaluation)
46) Reproduce results from published papers at IA3 workshop at ACM/IEEE Supercomputing'17 (artifact evaluation)
47) Reproduce results from published papers at PACT'17 (artifact evaluation)
48) Reproduce results from published papers at ACM/IEEE CGO'17 and PPoPP'17 (artifact evaluation)
49) Reproduce results from published papers at RTSS'16 (artifact evaluation)
50) Reproduce results from published papers at ACM/IEEE SuperComputing'16 (artifact evaluation)
51) Reproduce results from published papers at PACT'16 (artifact evaluation)
52) Reproduce results from published papers at ACM/IEEE CGO'16 and PPoPP'16 (artifact evaluation)
53) Reproduce results from published papers at ADAPT'16 (artifact evaluation)
54) Organize Dagstuhl Perspectives Workshop on Artifact Evaluation for Publications
55) Reproduce results from published papers at ACM/IEEE CGO'15 and PPoPP'15 (artifact evaluation)
56) Reproduce results from published papers at ADAPT'15 (artifact evaluation)
57) Organize the 1st ACM Workshop on Reproducible Research Methodologies and New Publication Models in Computer Engineering (TRUST'14 at PLDI'14)
58) Reproduce results from published papers at ADAPT'14 (artifact evaluation)