kernels.quantize
Dequantization utilities for bitsandbytes and FP8 integration.
Functions
| Name | Description |
|---|---|
| dequantize | NF4 / FP8 dequantization; under torch.compile NF4 dispatches via torch.ops.axolotl.nf4_dequantize. |
| dequantize_fp8 | Dequantize FP8 block-quantized weights: W_dequant = W_fp8 * scale_inv. |
dequantize
kernels.quantize.dequantize(W, quant_state=None)

NF4 / FP8 dequantization; under torch.compile NF4 dispatches via torch.ops.axolotl.nf4_dequantize.
dequantize_fp8
kernels.quantize.dequantize_fp8(W, scale_inv, dtype=torch.bfloat16)

Dequantize FP8 block-quantized weights: W_dequant = W_fp8 * scale_inv.
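
The formula W_dequant = W_fp8 * scale_inv can be illustrated with a small sketch. This is not the library's implementation: it uses plain Python lists instead of tensors so the arithmetic is visible without torch or a GPU, and it assumes a 2D block layout in which `scale_inv` holds one scale per `block_rows` x `block_cols` tile of the weight matrix. The helper name `blockwise_dequant` and both block-size parameters are hypothetical and appear only in this example.

```python
def blockwise_dequant(w, scale_inv, block_rows, block_cols):
    """Multiply each element of w by the scale of the block it falls in.

    Assumed layout (not confirmed by the reference above): scale_inv[bi][bj]
    is the scale for rows bi*block_rows.. and columns bj*block_cols..
    """
    return [
        [
            value * scale_inv[i // block_rows][j // block_cols]
            for j, value in enumerate(row)
        ]
        for i, row in enumerate(w)
    ]


# 4x4 matrix of FP8 values already decoded to Python floats,
# quantized in 2x2 blocks, so scale_inv is 2x2.
w_fp8 = [
    [1.0, 2.0, 1.0, 2.0],
    [3.0, 4.0, 3.0, 4.0],
    [1.0, 2.0, 1.0, 2.0],
    [3.0, 4.0, 3.0, 4.0],
]
scale_inv = [
    [0.5, 2.0],
    [1.0, 0.25],
]

w_dequant = blockwise_dequant(w_fp8, scale_inv, block_rows=2, block_cols=2)
for row in w_dequant:
    print(row)
```

Each 2x2 tile of `w_fp8` is multiplied by its own entry of `scale_inv`, which is the per-block form of the elementwise product in the description. A tensor version would typically expand `scale_inv` to the shape of the weight and cast the result to the requested `dtype`.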