FlashMLA: Efficient Multi-head Latent Attention Kernels
📊 Project Info
- Language: C++
- Stars: ⭐ 12,674
- Forks: 1,045
- Today: +1
- Ranking: #14
- Collection: Language
- Trending Date: May 30, 2026
- Last Push: 4/30/2026