Hi Fmf,
As someone once said:
"4 bits are enough".
...
and what an outlandish idea - binary NNs!
As PvdM has pointed out, IBM's simulations demonstrated that 4-bit deep-learning models in vision, speech, and language lose little in comparison with their 16-bit counterparts.
https://brainchip.com/4-bits-are-enough/
"T
o dive a little bit deeper into the value of 4-bit, in its 2020 NeurIPS paper IBM described the various pieces that are already present and how they come together. They prove the readiness and the benefit through several experiments simulating 4-bit training for a variety of deep-learning models in computer vision, speech, and natural language processing. The results show a minimal loss of accuracy in the models’ overall performance compared with 16-bit deep learning. The results are also more than seven times faster and seven times more energy efficient."
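For a concrete feel for what a 4-bit model is doing, here's a minimal Python sketch of symmetric 4-bit weight quantisation (my own illustration of the general technique, not IBM's actual recipe): floats are mapped to signed 4-bit integers and back, and the round-trip error is the price you pay.

```python
import numpy as np

def quantise_4bit(x):
    """Map floats to signed 4-bit integers in [-8, 7] (symmetric, per-tensor scale)."""
    scale = np.abs(x).max() / 7.0
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantise(q, scale):
    """Recover approximate floats from the 4-bit codes."""
    return q.astype(np.float32) * scale

w = np.random.randn(8).astype(np.float32)
q, s = quantise_4bit(w)
print(w)
print(dequantise(q, s))   # close to w, within 4-bit rounding error
```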
... and Akida does it with not a MAC in sight, just massively parallel neurons and skeletal sparsity.
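To illustrate the no-MAC point in the abstract (a toy sketch of event-driven computation generally, not a description of Akida's silicon): once activations are events rather than dense floats, a layer needs no multiplications at all. It simply accumulates the weight rows of the inputs that fired, and sparsity means most rows are never touched.

```python
import numpy as np

def event_driven_layer(active_inputs, weights):
    """Spiking/binary layer with no multiplications: for each input event,
    just add that input's weight row into the accumulators."""
    acc = np.zeros(weights.shape[1], dtype=np.int32)
    for i in active_inputs:   # sparse: only the inputs that fired
        acc += weights[i]     # pure accumulation, no MAC anywhere
    return acc

weights = np.random.randint(-8, 8, size=(100, 16))  # 4-bit-range weights
spikes = [3, 17, 42]                                # 3 active out of 100 inputs
print(event_driven_layer(spikes, weights))
```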
As long ago as September 2022, Arm, Intel and Nvidia came to understand that:
“Neural networks are a bit strange in that they are actually remarkably tolerant to relatively low precision,” said Richard Grisenthwaite, executive vice president and chief architect at Arm. “In our paper*, we showed you don’t need 32 bits of mantissa for precision. You can use only two or three bits, and four or five bits of exponent will give you sufficient dynamic range. You really don’t need the massive precision that was defined in 754 [the IEEE 754 floating-point scheme], which was designed for finite element analysis and other highly precise arithmetic tasks.”
* FP8 Formats for Deep Learning, by Paulius Micikevicius, Dusan Stosic, Neil Burgess, Marius Cornea, Pradeep Dubey, Richard Grisenthwaite, Sangwon Ha, Alexander Heinecke, Patrick Judd, John Kamalu, Naveen Mellempudi, Stuart Oberman, Mohammad Shoeybi, Michael Siu and Hao Wu (arXiv preprint, submitted 12 September 2022).
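For what those mantissa and exponent counts mean in practice, here's a rough Python decoder for an 8-bit float with a configurable exponent/mantissa split (a simplified sketch that ignores the paper's Inf/NaN conventions; the paper's two formats are E4M3 and E5M2): the exponent bits buy dynamic range, the mantissa bits buy precision.

```python
def decode_fp8(byte, exp_bits=5, man_bits=2):
    """Decode 1 sign bit + exp_bits exponent + man_bits mantissa.
    Handles subnormals; ignores Inf/NaN encodings for simplicity."""
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp  = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man  = byte & ((1 << man_bits) - 1)
    bias = (1 << (exp_bits - 1)) - 1          # 15 for E5M2, 7 for E4M3
    if exp == 0:                              # subnormal: no implicit leading 1
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)

print(decode_fp8(0b0_11110_11))               # 57344.0, largest normal E5M2 value
print(decode_fp8(0b0_00001_00))               # ~6.1e-05, smallest normal (2**-14)
```

Two mantissa bits sounds absurd until you notice the range: with five exponent bits the largest value is 57344, far more headroom than the precision of the codes between.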
Thinking in terms of mantissas and exponents locks both the thought process and the mathematical implementation into MACs (multiply-accumulate operations).
It's like making a road safety code for dinosaurs.