Cheng Luo

I am a researcher focusing on LLM optimization, currently at TikTok. Before that, I was a researcher at Microsoft Research. I also collaborate with Anima Anandkumar at Caltech and Beidi Chen at CMU.

I am interested in bridging hardware constraints with the principles of LLMs. I focus on making reasoning, training, and inference more efficient. Check out my research for more details.

Additionally, I am a passionate community builder and founded the Efficient Reasoning workshops.

news

Mar 2025: R-KV is released. Try it out 🙌
Feb 2025: HeadInfer is released. Try it out 🙌
Jul 2024: MsT is released. Try it out 🙌
Mar 2024: RTP is released. Try it out 🙌
Google Scholar GitHub