Topic Hub
Quantization, FP8 & Low-Precision AI Systems
FP16 vs BF16 vs FP8 Runtime Behavior
NVFP4 and Blackwell FP4 Systems
MXFP4 Microscaling Architectures
Dynamic Scaling Factors
Weight-Only Quantization
Activation Quantization
AWQ Quantization Systems
GPTQ Quantization
MR-GPTQ Runtime Optimization
SmoothQuant Outlier Suppression
Hadamard Outlier Mitigation
KV Cache Quantization
Low-Precision Matrix Multiplication
Phase-Aware Quantization (Mix-Quant)
Calibration and Accuracy Recovery