
Tsinghua University Storage Research Group wins championship at MLSys 2026 MoE Kernel Challenge

The Storage Research Group from the Department of Computer Science and Technology, Tsinghua University, has clinched first place in the Mixture-of-Experts (MoE) Kernel Optimization Challenge at the Ninth Annual Conference on Machine Learning and Systems (MLSys 2026), held recently in Bellevue, Washington. The champion team comprised Tsinghua University Ph.D. students Gao Shiwei, Fan Ruwen, Ren Tingxu, and Luo Yibin, advised by Professor Shu Jiwu and Associate Professor Lu Youyou of the same department, with technical support from Tencent AI Systems expert Reed.

The challenge focused on real-world decoding inference scenarios for the Qwen3-30B-A3B MoE model and drew participants from institutions including Stanford University, MIT, UC Berkeley, Carnegie Mellon University, UCLA, and Cornell University. Participants were tasked with writing custom kernels using the Neuron Kernel Interface (NKI) and optimizing inference performance on AWS Trainium2/3 platforms. Using the NKI programming framework provided by AWS, the Tsinghua University team implemented a series of systematic optimizations for the decoding phase of inference, including expert sharding, matrix-vector multiplication specialization, on-chip data layout reconstruction, cross-operator fusion, and automated operator optimization. These efforts reduced end-to-end inference latency from 14.91 seconds to 3.56 seconds, delivering a 4.12× speedup and a 4× increase in throughput. The result secured first place in the competition.

This victory gives the Tsinghua University Storage Research Group consecutive global titles, following its championship at the ASPLOS 2025 / EuroSys 2025 Contest on LLM Inference Optimization.

MLSys is widely recognized as a premier international academic conference dedicated to interdisciplinary research across machine learning and computer systems. The conference spotlights cutting-edge research in LLM training and inference, AI compilers, computer architecture, distributed systems, and specialized AI hardware.

Editor: Li Han

Copyright 2001-2021 news.tsinghua.edu.cn. All rights reserved.