kernels.quantize

Dequantization utilities for bitsandbytes and FP8 integration.

Functions

Name Description
dequantize Dequantize NF4 or FP8 weights. Under torch.compile, the NF4 path dispatches via torch.ops.axolotl.nf4_dequantize.
dequantize_fp8 Dequantize FP8 block-quantized weights: W_dequant = W_fp8 * scale_inv.

dequantize

kernels.quantize.dequantize(W, quant_state=None)

Dequantize NF4 or FP8 weights. Under torch.compile, the NF4 path dispatches via torch.ops.axolotl.nf4_dequantize.

dequantize_fp8

kernels.quantize.dequantize_fp8(W, scale_inv, dtype=torch.bfloat16)

Dequantize FP8 block-quantized weights: W_dequant = W_fp8 * scale_inv.
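
The formula above can be illustrated with a minimal PyTorch sketch. This is not the library's implementation: the helper name blockwise_dequant_sketch is hypothetical, and the layout (a 2-D weight with one scale_inv entry per rectangular block) is an assumption made for illustration. Float32 values stand in for FP8 storage so the sketch runs on any recent PyTorch build.

```python
import torch


def blockwise_dequant_sketch(w_q, scale_inv, dtype=torch.bfloat16):
    """Illustrative only: scale each block of w_q by its entry in scale_inv."""
    rows, cols = w_q.shape
    n_row_blocks, n_col_blocks = scale_inv.shape
    # Assumed layout: equal-sized blocks; ceil division allows a ragged last block.
    block_rows = -(-rows // n_row_blocks)
    block_cols = -(-cols // n_col_blocks)
    # Expand the per-block scales to one scale per weight element, then trim.
    scale = scale_inv.to(torch.float32)
    scale = scale.repeat_interleave(block_rows, dim=0)[:rows]
    scale = scale.repeat_interleave(block_cols, dim=1)[:, :cols]
    # Upcast before multiplying, since arithmetic on FP8 tensors is limited.
    return (w_q.to(torch.float32) * scale).to(dtype)


# Float32 stand-in for FP8 storage; real weights would use an FP8 dtype
# such as torch.float8_e4m3fn.
w = torch.ones(4, 4)
scale_inv = torch.tensor([[0.5, 2.0], [4.0, 8.0]])  # one scale per 2x2 block
out = blockwise_dequant_sketch(w, scale_inv)
```

Here each 2x2 block of w is multiplied by a single entry of scale_inv, and the result is cast to the requested dtype, mirroring the dtype=torch.bfloat16 default in the signature above.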