Fact Finder
Top 20
“To that end, Asanovic noted that on the X280 with the new VCIX interface, the X280 is capable of sending 1,024 bits over onto the accelerator/external component each cycle and retrieving 512 bits per cycle, every cycle sustained over the VCIX interface. [Per cycle @ 300 MHz, for example].”

Hi Fmf,
I think this bit is particularly interesting, especially the choice of example of a "hardware start-up with a novel way of processing neural networks".
To assist customers with such applications, SiFive developed the new Vector Coprocessor Interface Extension (VCIX, pronounced “Vee-Six”). VCIX allows for tight coupling between the customer’s SoC/accelerator and the X280. For example, consider a hardware AI startup with a novel way of processing neural networks or one that has designed a very large computational engine. Instead of designing a custom sequencer or control unit, they can simply use the X280 as a drop-in replacement. With VCIX, they are given direct connections to the X280. The interface includes direct access into the vector unit and memory units as well as the instruction stream, allowing an external circuit to utilize the vector pipeline as well as directly access the caches and vector register file.
[Attachment 25865: VCIX is designed to interface the NN to the SiFive X280.]
The VCIX is a high-performance direct-coupling interface to the X280 and its instruction stream. To that end, Asanovic noted that on the X280 with the new VCIX interface, the X280 is capable of sending 1,024 bits over onto the accelerator/external component each cycle and retrieving 512 bits per cycle, every cycle sustained over the VCIX interface. [Per cycle @ 300 MHz, for example].
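Just to put those numbers in perspective, here's a quick back-of-envelope sketch. The 300 MHz clock is only the example figure from the quote, not a spec limit, and the conversion below is my own arithmetic, not from the article:

```python
# Back-of-envelope sustained VCIX bandwidth at the 300 MHz example
# clock quoted above. Bits per cycle and the clock are the article's
# example figures; the GB/s conversion is illustrative only.

def sustained_gb_per_s(bits_per_cycle: int, clock_hz: float) -> float:
    """Bits per cycle x cycles per second, converted to GB/s (10^9 bytes)."""
    return bits_per_cycle * clock_hz / 8 / 1e9

CLOCK_HZ = 300e6  # example clock from the quote

to_accelerator = sustained_gb_per_s(1024, CLOCK_HZ)
from_accelerator = sustained_gb_per_s(512, CLOCK_HZ)
print(f"to accelerator:   {to_accelerator:.1f} GB/s")   # 38.4 GB/s
print(f"from accelerator: {from_accelerator:.1f} GB/s") # 19.2 GB/s
```

So even at a modest 300 MHz, that's roughly 38 GB/s out and 19 GB/s back, sustained, without going anywhere near a PCIe bus.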
On the other hand, Google seem reluctant to abandon their in-house MXU and TPU:
Cliff Young, Google TPU Architect, and MLPerf Co-Founder was also part of the SiFive announcement. As we’ve seen from other Google accelerators, their hardware team always looks to eliminate redundant work by utilizing off-the-shelf solutions if it doesn’t add any real value to design it themselves in-house.
For their own TPU accelerators, beyond the inter-chip interconnect and their highly-refined Matrix Multiply Unit (MXU) which utilizes a systolic array, much of everything else is rather generic and not particularly unique to their chip. Young noted that when they started 9 years ago, they essentially built much of this from scratch, saying “scalar and vector technologies are relatively well-understood. Krste is one of the pioneers in the vector computing areas and has built beautiful machines that way. But should Google duplicate what Krste has already been doing? Should we be reinventing the wheel along with the Matrix Multiply and the interconnect we already have? We’d be much happier if the answer was ‘no’. If we can focus on the stuff that we do great and we can also reuse a general-purpose processor with a general-purpose software stack and integrate that into our future accelerators.” Young added, “the promise of VCIX is to get our accelerators and our general-purpose cores closer together; not far apart across something like a PCIe interface with 1000s of cycles of delay but right next to each other with just a few 100s of cycles through the on-chip path and down to 10s of cycles through direct vector register access.”
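Young's rough cycle counts are easier to feel as wall-clock time. He gives no clock frequency, so the 1 GHz below is purely my assumed figure for scale, not a measurement:

```python
# Illustrative only: converting Young's rough latency cycle counts
# into time at an ASSUMED 1 GHz clock. The quote gives orders of
# magnitude ("1000s", "100s", "10s" of cycles), not exact values.

def cycles_to_ns(cycles: int, clock_hz: float = 1e9) -> float:
    """Latency in nanoseconds for a given cycle count and clock."""
    return cycles / clock_hz * 1e9

paths = [
    ("PCIe hop", 1000),                         # "1000s of cycles"
    ("on-chip path", 100),                      # "a few 100s of cycles"
    ("direct vector register access", 10),      # "10s of cycles"
]
for name, cycles in paths:
    print(f"{name}: ~{cycles_to_ns(cycles):.0f} ns at 1 GHz")
```

Two orders of magnitude between a PCIe round trip and direct vector-register access is the whole argument for VCIX in one line.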
Google are still dabbling with analog SNNs:
WO2020077215A1 TEMPORAL CODING IN LEAKY SPIKING NEURAL NETWORKS
[Attachment 25866]
Spiking neural networks that perform temporal encoding for phase-coherent neural computing are provided. In particular, according to an aspect of the present disclosure, a spiking neural network can include one or more spiking neurons that have an activation layer that uses a double exponential function to model a leaky input that an incoming neuron spike provides to a membrane potential of the spiking neuron. The use of the double exponential function in the neuron's temporal transfer function creates a better defined maximum in time. This allows very clearly defined state transitions between "now" and the "future step" to happen without loss of phase coherence.
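For anyone wondering what a "double exponential" kernel looks like: it's the difference of two decaying exponentials, which rises smoothly and then decays, giving the single well-defined peak in time the abstract is talking about. A minimal sketch, with time constants that are my illustrative picks and not values from the patent:

```python
import math

# Sketch of a double-exponential (difference-of-exponentials) leaky
# input kernel, as described in the patent abstract. Time constants
# here are illustrative assumptions, not values from the patent.

def double_exp(t: float, tau_decay: float = 10.0, tau_rise: float = 2.0) -> float:
    """Kernel f(t) = exp(-t/tau_decay) - exp(-t/tau_rise), for t >= 0."""
    return math.exp(-t / tau_decay) - math.exp(-t / tau_rise)

def peak_time(tau_decay: float = 10.0, tau_rise: float = 2.0) -> float:
    """Closed-form time of the kernel's single maximum (set f'(t) = 0)."""
    return (tau_decay * tau_rise / (tau_decay - tau_rise)
            * math.log(tau_decay / tau_rise))

t_star = peak_time()
print(f"peak at t = {t_star:.2f} (arbitrary units), value {double_exp(t_star):.3f}")
```

That single, sharply located maximum is what lets the network treat spike timing as a phase-coherent signal rather than a smeared-out blur.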
May be time for a little morphic resonance.
Footnote: Wonder if Cliff Young still has those gumboots?
Now wasn’t it Edge Impulse who was comparing AKIDA running at 300 MHz with a GPU running at 900 MHz???
My opinion only DYOR
FF
AKIDA BALLISTA