Interesting side bit on Quadric.
Just read this, which they released only a couple of weeks ago.
A couple of snips below; it was the paragraph about ViT and most current NPUs that I found interesting, given that both Quadric and we run ViT now.
Should hopefully be a good enhancement for Akida Gen 2, you'd expect.
Trust that its impending release, as stated by BRN, makes some inroads.
Availability:
Engaging with lead adopters now. General availability in Q3 2023.
MegaChips should be happy.
Quadric Announces Vision Transformer Support for Chimera GPNPUs
www.design-reuse-china.com
Jun. 21, 2023
High-performance implementation of ViT family of ML networks available for SoC designs
Burlingame, CA – June 20, 2023 – Quadric® today announced that its Chimera™ general purpose neural processing unit (GPNPU) processor intellectual property (IP) supports vision transformer (ViT) machine learning (ML) inference models. These newer ViT models are unsupported by most NPUs currently in production, making it impractical to run ViT on many existing edge AI system-on-chip (SoC) devices.
ViT models are the latest state-of-the-art ML models for image and vision processing in embedded systems. ViTs were first described in 2021 and now represent the cutting edge of inference algorithms in edge and device silicon. ViTs repeatedly interleave MAC-heavy operations (convolutions and dense layers) with DSP/CPU-centric code (normalization, SoftMax). The general-purpose architecture of the Chimera core family intermixes integer multiply-accumulate (MAC) hardware with general-purpose 32-bit ALU functionality, which enabled Quadric to rapidly port and optimize the ViT_B transformer model…
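For anyone curious what that interleaving actually looks like, here's a rough NumPy sketch of one ViT-B-shaped encoder block, with comments marking which ops are MAC-heavy (NPU-friendly) versus normalization/SoftMax (DSP/CPU-centric). The shapes follow ViT-B (hidden size 768, 12 heads, 197 tokens); this is my own illustration, not Quadric's implementation, and GELU is approximated with ReLU to keep it short.

```python
# Sketch of one ViT-B encoder block, annotating MAC-heavy (NPU) vs.
# DSP/CPU-centric ops. Illustrative only, not Quadric's code.
import numpy as np

D, H = 768, 12            # ViT-B hidden size and number of heads
Dh = D // H               # per-head dimension (64)

def layer_norm(x, eps=1e-6):     # DSP/CPU-centric: mean/variance/divide
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):                  # DSP/CPU-centric: exp + normalize
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def encoder_block(x, w):
    h = layer_norm(x)                                   # CPU/DSP
    # MAC-heavy: QKV projections (dense layers)          # NPU
    q = (h @ w["q"]).reshape(-1, H, Dh).transpose(1, 0, 2)
    k = (h @ w["k"]).reshape(-1, H, Dh).transpose(1, 0, 2)
    v = (h @ w["v"]).reshape(-1, H, Dh).transpose(1, 0, 2)
    # NPU matmul, then CPU/DSP SoftMax, then NPU matmul again
    att = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(Dh))
    h = (att @ v).transpose(1, 0, 2).reshape(-1, D) @ w["o"]
    x = x + h
    h = layer_norm(x)                                   # CPU/DSP again
    # MAC-heavy MLP; ReLU stands in for GELU here        # NPU
    x = x + np.maximum(h @ w["fc1"], 0) @ w["fc2"]
    return x

rng = np.random.default_rng(0)
shapes = {"q": (D, D), "k": (D, D), "v": (D, D), "o": (D, D),
          "fc1": (D, 4 * D), "fc2": (4 * D, D)}
w = {k: rng.standard_normal(s) * 0.02 for k, s in shapes.items()}
tokens = rng.standard_normal((197, D))   # 196 patches + class token
out = encoder_block(tokens, w)
print(out.shape)                         # (197, 768)
```

Note how every MAC-heavy matmul is sandwiched between normalization or SoftMax steps, which is exactly the pattern that forces a split NPU + DSP/CPU design to keep handing data back and forth.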
Existing NPUs Cannot Support Transformers
Most edge silicon solutions in the market today employ heterogeneous architectures that pair convolution NPU accelerators with traditional CPU and DSP cores. The majority of NPUs in silicon today were designed three or more years ago, when the ResNet family of convolution networks was state of the art and Vision Transformers had not yet taken the AI/ML world by storm. ResNet-type networks have at most one normalization layer at the beginning of a model and one SoftMax at the end, with a long chain of convolution operations making up the bulk of the compute. ResNet models therefore map very neatly onto convolution-optimized NPU accelerator cores as part of heterogeneous SoC chip architectures.
The emergence of ViT networks broke the underlying assumptions of these NPUs.
Mapping a ViT workload to a heterogeneous SoC would entail repeatedly moving data back and forth between the NPU and the DSP/CPU – 24 to 26 round-trip data transfers for the base-case ViT_B. The system power wasted on all those data transfers wipes out the matrix-compute efficiency gains from having the NPU in the first place.
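A back-of-envelope check on that "24 to 26" figure: ViT-B has 12 encoder blocks, each with two LayerNorms (plus a SoftMax inside attention), so if you assume each non-MAC region between matmul stages costs one NPU-to-DSP/CPU round trip, you land in roughly that range. The grouping below is my assumption; the press release doesn't spell out how it counts.

```python
# Back-of-envelope round-trip count for ViT-B on a split NPU + DSP/CPU SoC.
# Assumption (mine, not Quadric's): each non-MAC region (LayerNorm,
# SoftMax) between matmul stages forces one round trip off the NPU.
BLOCKS = 12                         # ViT-B encoder depth

low = BLOCKS * 2                    # two handoff regions per block: 24
high = BLOCKS * 2 + 2               # plus embedding-side and final norm: 26
print(f"{low}-{high} round trips")  # 24-26, the range the release quotes
```

Either way the point stands: dozens of off-accelerator excursions per inference, versus essentially zero for a ResNet-style chain of convolutions.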