It would be nice, but we shouldn't lose sight of the fact that ARM have been working on their own, albeit inferior, MAC-based NPU. Holy Guacamole! Rene Haas (Arm CEO) interview on Bloomberg four days ago. The interviewer asks him at 3:20: "Why not push Arm even closer to the heart of AI chip-making by designing an accelerator?"
You should see the expression on Rene's face! He-he-he!
He replies saying something like: "Ahhh, we could... Ahhh... Not talking about products that we haven't leaned forward on in terms of unannounced devices. The way I think about the market today is that we're doing the IP, and then going forward we're doing what we call subsystems, that's taking blocks of IP and having them be a finished solution: a GPU and accelerator, etc., etc. The next step forward we'll be doing a bit more on that, maybe something physical. We haven't talked publicly about doing that, but is it in the spectrum of what we could do? You know, quite possibly."
At 5:20 Rene is asked to explain how he sees Arm fitting into SoftBank's very ambitious plans for AI. He replies, in regards to Masayoshi Son: "that very device that he might be looking at, on a hardware side or a software side, is probably going to run with or through Arm, so understanding what that synergy is, but more importantly how to accelerate it, I think that is the unique attribute that we bring to his portfolio".
Arm CEO Haas Shares Vision for Powering AI Revolution
Rene Haas, CEO, Arm, discusses his vision for the semiconductor industry that is not only powering but reshaping the future of computing, with Bloomberg's Tom Giles at Bloomberg Tech in San Francisco. (Source: Bloomberg) www.bloomberg.com
US2024028877A1, NEURAL PROCESSING UNIT FOR ATTENTION-BASED INFERENCE, filed 2022-07-21
[0066] At stage S11, the direct memory access element (112) of the NPU (106) fetches compressed projection matrices WQ, WK, and WV from the flash memory (102). The weight decoder (120) decodes the compressed matrices. The MAC engine (116) calculates query matrix Q, key matrix K, and value matrix V by multiplying the query projection matrix WQ, the key projection matrix WK, and the value projection matrix WV by input matrix X.
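In plain linear-algebra terms, stage S11 is just three matrix multiplications. Here's a minimal NumPy sketch of what paragraph [0066] describes; the function name and toy shapes are mine, not the patent's, and the real hardware does this in the MAC engine (116) on weights decompressed by the weight decoder (120):

```python
import numpy as np

def project_qkv(X, WQ, WK, WV):
    # Stage S11: the MAC engine multiplies the input matrix X by
    # each decompressed projection matrix to produce the query,
    # key and value matrices.
    Q = X @ WQ
    K = X @ WK
    V = X @ WV
    return Q, K, V

# Toy shapes only: 4 input tokens, model width 8.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))
WQ, WK, WV = (rng.standard_normal((8, 8)) for _ in range(3))
Q, K, V = project_qkv(X, WQ, WK, WV)
```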
This arrangement is suitable for natural language processing, as the DMA (112) provides access to earlier speech for context.
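That "access to earlier speech" reads like what software implementations call a KV cache: keys and values from earlier tokens stay in memory and are re-fetched (here, by the DMA) so each new token can attend over the whole history. A hedged sketch of that idea, assuming a single attention head; the caching scheme is my illustration, not the patent's wording:

```python
import numpy as np

def attend_with_history(q_new, K_hist, V_hist):
    # One new query attends over all cached keys/values,
    # i.e. the accumulated conversational context.
    scores = (K_hist @ q_new) / np.sqrt(q_new.shape[-1])
    w = np.exp(scores - scores.max())  # numerically stable softmax
    w /= w.sum()
    return w @ V_hist

# Toy usage: 16 earlier tokens of cached history, width 8.
rng = np.random.default_rng(1)
K_hist = rng.standard_normal((16, 8))
V_hist = rng.standard_normal((16, 8))
q_new = rng.standard_normal(8)
context_output = attend_with_history(q_new, K_hist, V_hist)
```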
Of course, ARM will be fully aware of TeNNs. So does all the discussion about not making the Akida SoC, so as not to pre-empt a customer, come back to ARM?
We joined the ARM menagerie in May 2022. The patent was filed in July 2022, but they would probably have been working on it before the BRN/ARM cooperation began, given the preliminary R&D involved. Akida 2 came later (March 2023?).
So we can assume ARM have known about TeNNs for at least a year to 15 months. The TeNNs patents were filed in June 2022, and BRN could have been discussing TeNNs under NDA from then onwards, meaning ARM may have known about TeNNs for almost two years.