Cheng Luo

I am an independent researcher focusing on LLM optimization. Before that, I was a researcher at Microsoft Research. I am also collaborating with Anima Anandkumar at Caltech and Beidi Chen at CMU. I am currently a postdoctoral researcher at Caltech.

I am interested in bridging hardware constraints with the principles of learning in neural networks. I focus on developing hardware-efficient learning algorithms that are principled and scalable for large-scale training, such as training large language models (LLMs). Check out my research for more details.

Additionally, I am a passionate community builder and founded the Efficient Reasoning workshops.

news

Mar, 2025 R-KV is released. Try it out 🙌

Feb, 2025 HeadInfer is released. Try it out 🙌

Jul, 2024 MsT is released. Try it out 🙌

Mar, 2024 RTP is released. Try it out 🙌