kernels.quantize

Dequantization utilities for bitsandbytes and FP8 integration.

Functions

Name	Description
dequantize	NF4 / FP8 dequantization; under `torch.compile` NF4 dispatches via `torch.ops.axolotl.nf4_dequantize`.
dequantize_fp8	Dequantize FP8 block-quantized weights: W_dequant = W_fp8 * scale_inv.

kernels.quantize.dequantize(W, quant_state=None)

NF4 / FP8 dequantization; under `torch.compile` NF4 dispatches via `torch.ops.axolotl.nf4_dequantize`.

kernels.quantize.dequantize_fp8(W, scale_inv, dtype=torch.bfloat16)

Dequantize FP8 block-quantized weights: W_dequant = W_fp8 * scale_inv.