core.trainers.utils

Utils for Axolotl trainers

Functions

Name Description
trainable_tokens_per_sec_per_gpu Effective per-GPU trainable-token throughput over a logging window.

trainable_tokens_per_sec_per_gpu

core.trainers.utils.trainable_tokens_per_sec_per_gpu(
    prev_trainable,
    curr_trainable,
    world_size,
    elapsed,
)

Effective per-GPU trainable-token throughput over a logging window.

curr_trainable and prev_trainable are readings of the cumulative trainable-token counter (SUM-reduced across all ranks), taken at the current log step and at the previous one. Their difference therefore covers every gradient-accumulation microbatch in the window, and the elapsed wall time includes any overhead incurred inside the window. Returns None when there is no prior window.
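
The description above suggests a simple calculation: take the change in the counter, divide by the number of ranks, then divide by the elapsed seconds. The following is a minimal sketch of that arithmetic, not the Axolotl implementation. The function name `sketch_tokens_per_sec_per_gpu` is hypothetical, and two behaviours are assumptions: that "no prior window" is signalled by `prev_trainable` being `None`, and that a non-positive `elapsed` or `world_size` also yields `None`.

```python
from typing import Optional


def sketch_tokens_per_sec_per_gpu(
    prev_trainable: Optional[float],
    curr_trainable: float,
    world_size: int,
    elapsed: float,
) -> Optional[float]:
    """Hypothetical re-derivation of the documented metric."""
    # Assumption: a missing previous reading means there is no prior window.
    if prev_trainable is None:
        return None
    # Assumption: guard against a degenerate window or world size.
    if elapsed <= 0 or world_size <= 0:
        return None
    # The counter is summed across ranks, so the delta is a global token count.
    delta = curr_trainable - prev_trainable
    # Divide by ranks for a per-GPU figure, then by seconds for a rate.
    return delta / world_size / elapsed


# First log: no previous reading, so no throughput can be reported.
first = sketch_tokens_per_sec_per_gpu(None, 1_000_000, 8, 10.0)
# Second log: 800,000 new trainable tokens across 8 GPUs in 10 seconds.
second = sketch_tokens_per_sec_per_gpu(1_000_000, 1_800_000, 8, 10.0)
print(first)   # None
print(second)  # 10000.0
```

Because the counter is cumulative and already reduced across ranks, the caller only needs to remember the previous reading and the time of the previous log; no per-microbatch bookkeeping is required.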