Attn-QAT: 4-Bit Attention With Quantization-Aware Training
Preprint, 2025
Attn-QAT improves the quality of low-bit (4-bit) attention by applying quantization-aware training, so the model learns to compensate for quantization error during training rather than only at inference time.
Recommended citation: Peiyuan Zhang*, Matthew Noto*, Wenxuan Tan*, Chengquan Jiang, Will Lin, Wei Zhou, Hao Zhang. (2025). "Attn-QAT: 4-Bit Attention With Quantization-Aware Training." Preprint.
Download Paper
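
To make the idea concrete, here is a minimal, generic sketch of QAT-style "fake quantization" applied to attention logits. This is an illustrative assumption, not the paper's actual Attn-QAT method: the function names (`fake_quant_int4`, `quantized_attention`), the symmetric per-tensor scaling, and the choice to quantize the pre-softmax logits are all hypothetical choices made for this example.

```python
import numpy as np

def fake_quant_int4(x: np.ndarray) -> np.ndarray:
    """Simulate symmetric 4-bit quantization: quantize, then dequantize.

    In QAT, the forward pass uses these rounded values while the backward
    pass typically treats rounding as identity (straight-through estimator).
    This sketch shows only the forward simulation.
    """
    qmax = 7  # int4 range is [-8, 7]; use +/-7 for a symmetric grid
    scale = np.abs(x).max() / qmax
    if scale == 0.0:
        return x  # all-zero input needs no quantization
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

def quantized_attention(q: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Scaled dot-product attention with fake-quantized logits (hypothetical)."""
    logits = q @ k.T / np.sqrt(q.shape[-1])
    logits = fake_quant_int4(logits)  # simulate 4-bit precision here
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    return probs @ v
```

In a real QAT setup, `fake_quant_int4` would be inserted into the training graph so gradients flow through it (via a straight-through estimator), letting the weights adapt to the 4-bit grid.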
