2023/10/10  8.0.0 RC 2       8.0.100-rc.2.23502.2
2023/09/12  8.0.0 RC 1       8.0.100-rc.1.23463.5
2023/08/08  8.0.0 Preview 7  8.0.100-preview.7.23376.3
2023/07/11  8.0.0 Preview 6  ...
Abstract: We present an 8-bit floating-point (FP8) training processor which implements (1) highly parallel tensor cores (fused multiply-add trees) that maintain high utilization throughout forward ...