Dive into advanced quantization techniques. Learn to implement and customize linear quantization functions, measure quantization error, and compress model weights using PyTorch for efficient and accessible AI models.
machine-learning
quantization
model-compression
linear-quantization
quantization-error
ai-optimization
advanced-quantization
symmetric-quantization
asymmetric-quantization
per-tensor-granularity
per-channel-granularity
per-group-granularity
pytorch-quantizer
weight-packing
8-bit-compression
2-bit-weights
Updated May 22, 2024 - Jupyter Notebook