4-bit floating point: FP4
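The article reviews how floating-point bit widths evolved from 32-bit to 64-bit, then examines FP4, the 4-bit floating-point format used in current AI hardware. This extremely narrow format retains a degree of precision while significantly improving compute efficiency and memory utilization.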
FP4 is a 4-bit floating-point format with 1 sign bit, 2 exponent bits, and 1 mantissa bit (a layout often written E2M1). With only 16 bit patterns, its precision and dynamic range are very limited, so it is suited mainly to specialized applications such as AI inference, where memory bandwidth is constrained.
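To make the 1/2/1 bit split concrete, here is a minimal Python sketch that decodes a 4-bit pattern into its value. It rests on assumptions the text above does not state: an exponent bias of 1, subnormal handling when the exponent field is zero, and no encodings reserved for infinity or NaN (the conventions of the E2M1 layout in the OCP Microscaling specification). The function name `decode_fp4_e2m1` is hypothetical and chosen for illustration only.

```python
def decode_fp4_e2m1(nibble: int) -> float:
    """Decode a 4-bit pattern laid out as [sign | exp1 exp0 | mantissa].

    Assumed conventions (not stated in the text): exponent bias 1,
    subnormals when the exponent field is 0, no inf/NaN encodings.
    """
    if not 0 <= nibble <= 0xF:
        raise ValueError("expected a 4-bit value in the range 0..15")

    sign = -1.0 if nibble & 0b1000 else 1.0
    exponent = (nibble >> 1) & 0b11
    mantissa = nibble & 0b1

    if exponent == 0:
        # Subnormal: no implicit leading 1, so the value is 0 or 0.5.
        magnitude = mantissa * 0.5
    else:
        # Normal: implicit leading 1, one fraction bit worth 0.5.
        magnitude = (1.0 + mantissa * 0.5) * 2.0 ** (exponent - 1)

    return sign * magnitude


# Enumerate the non-negative half of the format (sign bit clear).
positive_values = [decode_fp4_e2m1(n) for n in range(8)]
print(positive_values)
```

Under these assumptions the format can represent only eight distinct magnitudes, which illustrates why the precision and dynamic range are described as limited.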
USearch introduces FP8 (8-bit floating-point) support for vector search and KV caching, reducing memory usage and speeding up computation. Quantizing vectors to FP8 lowers storage requirements while maintaining search accuracy.