Nvidia breakthrough gives 4-bit pretraining technique the accuracy of FP8

@[email protected]

I thought FP4 was for quantization only. Is it for training now too?

☆ Yσɠƚԋσʂ ☆

Looks like it, and supposedly without any loss of quality.

@[email protected]

"next generation of frontier models"

lol. Too much grifter speak for me. Slow down on that kool aid.

☆ Yσɠƚԋσʂ ☆

People building their whole identity around hating LLM tech will never stop being hilarious.

@[email protected]

iTs jUsT a PaTtErN mAcHiNe

queermunist she/her

The math is 62% accurate? Is that what that’s saying?

☆ Yσɠƚԋσʂ ☆

In this context, accuracy is the percentage of questions the model answered correctly on the MMLU-Pro benchmark. So it's not the math itself being 62% accurate; it's the share of benchmark questions the model gets right overall.
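To make that concrete, here is a minimal sketch of how a benchmark accuracy figure like this is computed. The `accuracy` helper and the answer lists are made up for illustration; this is not the actual MMLU-Pro evaluation harness, just the same arithmetic on toy data.

```python
def accuracy(predictions, answers):
    """Return the percentage of predictions that match the reference answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("need at least one question")
    # Count exact matches between the model's choice and the reference choice.
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


# Hypothetical multiple-choice data, not real benchmark questions or results.
answers = ["A", "C", "B", "D", "J"]
predictions = ["A", "C", "D", "D", "B"]

print(f"{accuracy(predictions, answers):.1f}%")  # prints "60.0%"
```

Three of the five toy answers match, so the score is 60%. A reported 62% means the same thing at benchmark scale: roughly 62 out of every 100 questions answered correctly.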
