I was on the Brainchip website and found a piece where they discuss Xilinx, and have included an excerpt below.
It’s under “The challenge of building inferencing chips”.
https://brainchipinc.com/challenges-of-inferencing-chips/
Speed matters
The key factor here is throughput. “These are generally plugged-in devices. Power is always critical, and there is only so much dissipation you can afford. But in the hierarchy of systems, there are other things that come before power. Memory is certainly another big component of AI inference at the edge. How much memory and how much bandwidth can you sustain?”
For companies building these chips, market opportunities are flourishing. Geoff Tate, CEO of Flex Logix, points to markets such as biomedical imaging for implementing AI in ultrasound systems, genomic systems, and scientific imaging applications that require very high resolution and very high frame rates. Surveillance cameras for retail stores are also growing in use, so retailers can extend the use of the cameras already wired into their servers to capture information such as how many customers are coming into the store, customer wait times, etc.
While many, if not most, inferencing chips are mainly CPU-based, Flex Logix uses some embedded FPGA technology in its inferencing chip. “Companies like Microsoft use FPGAs in their datacenter today. They’ve deployed FPGAs for some time. They’ve done it because they found workloads that are common in their datacenter for which they can write code that runs on the FPGA, and basically it will run faster at lower cost and power than if it ran on a processor,” Tate said.
This opens up a whole swath of new options. “If it runs faster on the Xilinx boards than on an Intel Xeon, and the price is better, the customer just wants throughput per dollar and the FPGA can do better,” said Tate. “In the Microsoft data center, they run their inference on FPGAs because the FPGA needs a lot of multiplier-accumulators and the Xeons don’t have them. Microsoft has shown for years that FPGA is good for inference.”
Flex Logix’s path to an inference chip started with a customer asking for an FPGA that was optimized for inferencing. “There was a time when FPGAs just had logic,” he said. “There were no multiplier-accumulators in them. That was in the ’80s, when Xilinx first came out with them. At a later point in time, all FPGAs had multiplier-accumulators in them, introduced primarily for signal processing. They were optimized in terms of their size and their function for signal processing applications. Those multiplier-accumulators are why Microsoft is doing inference using FPGAs, because FPGAs have a fair number of multiplier-accumulators,” Tate explained.
Then development teams started using GPUs for inferencing, because they also have a lot of multipliers and accumulators. But they weren’t optimized for inference, although Nvidia has been slowly optimizing for it. Flex Logix’s customer asked the company to change its FPGA in two ways. The first was to change all the MACs from 22-bit to 8-bit, throwing away the extra bits to make a smaller multiplier-accumulator. The second, given that the smaller MAC meant more of them could fit into the same area, was to allocate more of that area to MACs.
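To make the 8-bit MAC point concrete, here is a rough Python sketch of my own (not from the article or Flex Logix): the core inference operation is a dot product of 8-bit weights and activations summed into a wider accumulator, and a narrower multiplier is much smaller in silicon, which is why more of them fit in the same area.

import numpy as np

def int8_mac(activations, weights):
    # Multiply-accumulate over int8 inputs; widen to int32 first so the
    # products and the running sum don't overflow the narrow type.
    a = activations.astype(np.int32)
    w = weights.astype(np.int32)
    return int(np.sum(a * w))

rng = np.random.default_rng(0)
acts = rng.integers(-128, 128, size=64, dtype=np.int8)
wts = rng.integers(-128, 128, size=64, dtype=np.int8)
print(int8_mac(acts, wts))  # one accumulator's worth of work in a MAC array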
“We’ll find out over this next year which of the architectures actually deliver better throughput per dollar, or the throughput per watt, and those will be the winners,” Tate said. “The customer doesn’t care which one wins. To them it’s just a piece of silicon. They put in their neural model, the software does the magic to make the silicon work, and they don’t care what’s inside as long as the answers come out, at high throughput, and the price and power are right.”
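As a back-of-the-envelope illustration of the “throughput per dollar / throughput per watt” yardstick Tate mentions, here is a small Python snippet of mine (all numbers are made up, purely to show the arithmetic):

chips = {
    "Chip A": {"inferences_per_sec": 1000, "price_usd": 400, "power_w": 15},
    "Chip B": {"inferences_per_sec": 1500, "price_usd": 900, "power_w": 40},
}
for name, c in chips.items():
    per_dollar = c["inferences_per_sec"] / c["price_usd"]
    per_watt = c["inferences_per_sec"] / c["power_w"]
    print(f"{name}: {per_dollar:.2f} inf/s per $, {per_watt:.1f} inf/s per W")

On those made-up figures Chip A wins on both ratios even though Chip B has higher raw throughput, which is exactly Tate’s point: the customer buys the ratio, not the architecture.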
Looks like they decided on which way to go and it involved Xilinx, and if that’s the case it could point towards “explosive growth”.
A previous post about Leddartech (involved with Mercedes) indicated that one of their suppliers was Renesas and that their FPGA supplier was Xilinx. How good is that for a link?
Good times ahead!