BRN Discussion Ongoing

cosors

👀
Just to mention the name here, UBTECH Robotics from China.
I can't find any connection, except that they partner with Carnegie Mellon University. I became aware of them through the Kickstarter project UGOT, which has an NPU. Maybe someone can find out which NPUs they use.
1697722829304.jpeg

UBTECH is the visionary company behind the innovative UGOT Robotic Kit. As a global leader in intelligent humanoid robotics and AI technology, UBTECH is dedicated to creating cutting-edge robotic solutions that enrich and inspire lives. Through research and development, UBTECH has pioneered breakthroughs in the field, delivering advanced products that combine state-of-the-art technology, creativity, and functionality. With a diverse portfolio of humanoid robots, educational robots, and AI-driven software, UBTECH continues to push the boundaries of what robots can achieve.


I'm not mentioning this to make up any connection. Maybe someone can find something, or finds what they are doing interesting. They have interesting products with a wide range of applications. Maybe purely Chinese products?
https://www.ubtrobot.com/web2/template/index.shtml

More products and partners are listed at the bottom of this page, but seemingly no technology partners, only telecommunication companies, research institutions and universities (I have checked the logos):
https://commercial.ubtrobot.com/global
 
  • Like
  • Fire
Reactions: 11 users
This guy seems to like Akida a little :)


Pradeep R
Director Cloud,AI Infrastructure Engineering,Cloud Solutions Architect | Multi Cloud (AWS,Azure, Google Cloud )| Data Center,Edge | DevOps | FinOps |Cloud Migration | Cloud Modernization | Cloud Security | Sustainability
1w Edited

BrainChip Unveils Second-Generation Akida Platform for Edge AI Advancements

In an era marked by an insatiable appetite for artificial intelligence (AI) capabilities, BrainChip, a pioneer in neural network processors, has taken a significant stride towards empowering edge devices with unprecedented processing power. The company’s latest unveiling, the second-generation Akida platform, represents a groundbreaking leap in the realm of Edge AI, delivering the potential to liberate devices from cloud dependency.

The initial glimpse of BrainChip’s original Akida neuromorphic processing technology at the Linley Fall Processor Conference in 2019 paved the way for a journey that materialized in the form of development kits for the general public in 2021. In March 2023, there was an announcement of Akida 2.0, a refinement promising support for Temporal Event-Based Neural Network (TENN) acceleration and optional vision transformer hardware. This enhancement not only amplifies the platform’s capabilities but also lightens the computational load on the host processor.

BrainChip classified Akida 2.0 into three distinct product classes: Akida-E, prioritizing energy efficiency; Akida-S, designed for seamless integration into microcontroller units and systems-on-chips; and Akida-P, a high-performance range supplemented by optional vision transformer acceleration.

Now, BrainChip has initiated an “early access” program, granting access to the Akida 2.0 intellectual property (IP) and promising an “order of magnitude” improvement in compute density through TENNs support. This transformative leap is a testament to the inevitable shift towards Multimodal Edge AI, a trend intensifying the demands on intelligent computing at the Edge.

Researchers laud this development, emphasizing that BrainChip’s second-generation Akida aligns precisely with the critical requisites of performance, efficiency, accuracy, and reliability necessary to expedite this transition. Central to the Akida 2.0 platform are the TENNs, offering a staggering “order of magnitude” reduction in model size and computational requirements. This leap in efficiency augurs well for the acceleration of AI adoption and holds the promise of rendering Edge AI solutions more accessible and easy to deploy.

Please like and repost the content if you like it, and spread the knowledge. Thanks.
BrainChip Unveils Second-Generation Akida Platform for Edge AI Advancements

https://www.marktechpost.com




 
  • Like
  • Fire
  • Love
Reactions: 47 users

Tothemoon24

Top 20
IMG_7695.jpeg
Here’s why Quadric has made our Top 13 AI/ML Startups to Bet On Q4 Update:

CEO Veerbhan K. exemplifies focus and Quadric’s mission is to create ML-optimized processor IP that empowers rapid AI SoC development and model porting. This unwavering focus on efficiency is how Quadric is going to make AI more accessible, and responsible and democratize it.

It's also Kheterpal’s boldness to challenge norms that stands out. I needed no more evidence than his fearlessly posting a probing question to Andrew Ng, a luminary of AI, during a Q&A session at the AI Hardware Summit. Fortune favors the brave!

Let’s explore how he and Quadric are making AI responsible and democratizing it:

The landscape of Large Language Models is growing exponentially, creating great potential yet also creating the risk of negative consequences like models with bias and inaccuracies. It also brings to light the issue of skyrocketing and unsustainable energy consumption in training LLMs.

Only large companies can manage the rising costs of training and retraining models, contradicting the core principle of democratization.

However, Meta introduced the highly-anticipated Llama2 LLM in July which stands out because it’s open-source, free, and designed for both commercial AND research use. It’s also trained on significantly more parameters than other models, emphasizing safety and responsibility.

Meta launched Llama2 along with a groundbreaking announcement of partnering with Qualcomm, which will integrate Llama2 into next-gen Snapdragon chips in smartphones and laptops beginning next year. This is considered a milestone since LLMs have been considered viable only in data centers with access to vast power resources.

Yet Quadric doesn’t view this through rose-colored lenses. CMO Steve Roddy voiced a contrarian perspective, asking, “Why would titans of the semiconductor and IP worlds need to wait until 2024 or 2025 or beyond to support today’s newest, hottest ML model?”

With the rate of change in LLMs and vision models intensifying, the reality is most accelerators designed for AI at the Edge would require a respin for each evolution. Even FPGAs, like GPUs, require more power than is suitable for Edge applications.

Quadric’s approach is different. Their general-purpose Neural Processing Unit, known as "Chimera," combines field programmability, like a GPU, with a power-performance profile that makes Edge AI feasible across a variety of consumer devices. What’s more, they support this blend of programmability and performance with a dedicated Developer Studio to significantly expedite the porting process.

Quadric’s emphasis on efficiency, driven by Kheterpal’s leadership, not only empowers developers but also paves the way with fewer hurdles and faster time-to-market at reduced costs; leaving us with no doubt that Quadric is playing a pivotal role in making AI genuinely accessible to all.
 
  • Like
  • Fire
Reactions: 12 users

equanimous

Norse clairvoyant shapeshifter goddess
  • Haha
Reactions: 4 users

cosors

👀
I wish my other two CEOs were on like Coby.
What is our terribly intelligent AI thinking up here 🤔😅
Screenshot_2023-10-19-21-46-30-99_40deb401b9ffe8e1df2f1cc5ba480b12.jpg

or is this your work @zeeb0t you rogue? :D

____
Musk has managed to destroy part of 100B with careless rash words. So much for CEOs.
 
  • Like
  • Fire
  • Wow
Reactions: 8 users

IloveLamp

Top 20
  • Like
  • Love
  • Fire
Reactions: 31 users

Dhm

Regular
Didn't @Diogenese put the pox on Quadric's way of doing things?
From their published patent, Quadric has a very clunky software controlled microprocessor/MAC array.

I don't know if they have done a ground-up redesign in the last 18 months - I suppose anything's possible, but ...


US2023083282A1 SYSTEMS AND METHODS FOR ACCELERATING MEMORY TRANSFERS AND COMPUTATION EFFICIENCY USING A COMPUTATION-INFORMED PARTITIONING OF AN ON-CHIP DATA BUFFER AND IMPLEMENTING COMPUTATION-AWARE DATA TRANSFER OPERATIONS TO THE ON-CHIP DATA BUFFER

1688919002224.png





[0044] An array core 110 preferably functions as a data or signal processing node (e.g., a small microprocessor) or processing circuit and preferably, includes a register file 112 having a large data storage capacity (e.g., 1024 kb, etc.) and an arithmetic logic unit (ALU) 118 or any suitable digital electronic circuit that performs arithmetic and bitwise operations on integer binary numbers.

[0048] An array core 110 may, additionally or alternatively, include a plurality of multiplier (multiply) accumulators (MACs) 114 or any suitable logic devices or digital circuits that may be capable of performing multiply and summation functions. In a preferred embodiment, each array core 110 includes four (4) MACs and each MAC 114 may be arranged at or near a specific side of a rectangular shaped array core 110.
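For anyone unsure what a MAC does in practice, here is a toy Python sketch of a dot product spread over four multiply-accumulate lanes, matching the "four (4) MACs" per array core in paragraph [0048]. This is only an illustration under my own assumptions: the function name `mac_dot` and the round-robin assignment of elements to lanes are hypothetical and are not taken from the patent or from Quadric's actual design.

```python
from typing import Sequence

NUM_MACS = 4  # the patent's preferred embodiment: four MACs per array core


def mac_dot(a: Sequence[int], b: Sequence[int]) -> int:
    """Dot product computed with NUM_MACS independent multiply-accumulate lanes."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    accumulators = [0] * NUM_MACS  # one accumulator register per MAC
    for i, (x, y) in enumerate(zip(a, b)):
        lane = i % NUM_MACS          # round-robin lane choice (an assumption)
        accumulators[lane] += x * y  # the multiply-accumulate step itself
    return sum(accumulators)         # final reduction across the lanes


print(mac_dot([1, 2, 3, 4, 5], [6, 7, 8, 9, 10]))  # 130
```

In real hardware the lanes would run in parallel; the loop here only shows which work each lane would be given.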
 
  • Like
Reactions: 7 users

rgupta

Regular
MegaChips is an investor in Quadric and a customer of Quadric IP.
 
  • Like
Reactions: 6 users

M_C

Founding Member
This will be worth a watch imo

1000006873.png
 
  • Like
  • Fire
Reactions: 13 users

Frangipani

Regular
IBM came out of stealth mode today with NorthPole, an extension of TrueNorth…



19 Oct 2023

A new chip architecture points to faster, more energy-efficient AI​

A new chip prototype from IBM Research’s lab in California, long in the making, has the potential to upend how and where AI is used efficiently.


We’re in the midst of a Cambrian explosion in AI. Over the last decade, AI has gone from theory and small tests to enterprise-scale use cases. But the hardware used to run AI systems, although increasingly powerful, was not designed with today’s AI in mind. As AI systems scale, the costs skyrocket. And Moore’s Law, the theory that the density of circuits in processors would double each year, has slowed.

But new research out of IBM Research’s lab in Almaden, California, nearly two decades in the making, has the potential to drastically shift how we can efficiently scale up powerful AI hardware systems.

Since the birth of the semiconductor industry, computer chips have primarily followed the same basic structure, where the processing units and the memory storing the information to be processed are stored discretely. While this structure has allowed for simpler designs that have been able to scale well over the decades, it’s created what’s called the von Neumann bottleneck, where it takes time and energy to continually shuffle data back and forth between memory, processing, and any other devices within a chip. The work by IBM Research’s Dharmendra Modha and his colleagues aims to change this, taking inspiration from how the brain computes. “It forges a completely different path from the von Neumann architecture,” according to Modha.

Over the last eight years, Modha has been working on a new type of digital AI chip for neural inference, which he calls NorthPole. It’s an extension of TrueNorth, the last brain-inspired chip that Modha worked on prior to 2014. In tests on the popular ResNet-50 image recognition and YOLOv4 object detection models, the new prototype device has demonstrated higher energy efficiency, higher space efficiency, and lower latency than any other chip currently on the market, and is roughly 4,000 times faster than TrueNorth.

The first promising set of results from NorthPole chips were published today in Science. NorthPole is a breakthrough in chip architecture that delivers massive improvements in energy, space, and time efficiencies, according to Modha.
Using the ResNet-50 model as a benchmark, NorthPole is considerably more efficient than common 12-nm GPUs and 14-nm CPUs. (NorthPole itself is built on 12 nm node processing technology.) In both cases, NorthPole is 25 times more energy efficient, when it comes to the number of frames interpreted per joule of power required. NorthPole also outperformed in latency, as well as space required to compute, in terms of frames interpreted per second per billion transistors required. According to Modha, on ResNet-50, NorthPole outperforms all major prevalent architectures — even those that use more advanced technology processes, such as a GPU implemented using a 4 nm process.
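The two metrics named above can be written down as simple ratios. The sketch below is a minimal illustration, assuming the usual definitions (one watt is one joule per second). The function names and all input numbers are hypothetical placeholders chosen so that the ratio comes out at 25; they are not measurements from the article or the Science paper.

```python
def frames_per_joule(frames_per_second: float, power_watts: float) -> float:
    """Energy efficiency: 1 W = 1 J/s, so (frames/s) / (J/s) = frames/J."""
    if power_watts <= 0:
        raise ValueError("power must be positive")
    return frames_per_second / power_watts


def frames_per_second_per_billion_transistors(
    frames_per_second: float, transistors: float
) -> float:
    """Space efficiency as described in the article."""
    if transistors <= 0:
        raise ValueError("transistor count must be positive")
    return frames_per_second / (transistors / 1e9)


# Hypothetical inputs, for illustration only.
baseline = frames_per_joule(frames_per_second=1000.0, power_watts=100.0)
candidate = frames_per_joule(frames_per_second=1000.0, power_watts=4.0)
print(candidate / baseline)  # 25.0
```

The point of the normalisation is that a chip can win on frames per joule at the same frame rate simply by drawing less power.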

How does it manage to compute with so much more efficiency than existing chips? One of the biggest differences with NorthPole is that all of the memory for the device is on the chip itself, rather than connected separately. Without that von Neumann bottleneck, the chip can carry out AI inferencing considerably faster than other chips already on the market. NorthPole was fabricated with a 12-nm node process, and contains 22 billion transistors in 800 square millimeters. It has 256 cores and can perform 2,048 operations per core per cycle at 8-bit precision, with potential to double and quadruple the number of operations with 4-bit and 2-bit precision, respectively. “It’s an entire network on a chip,” Modha said.
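The figures in the paragraph above multiply out as follows. This is just the arithmetic on the quoted numbers, in Python; the function name `ops_per_cycle` is mine, and no clock frequency is assumed because the article does not give one.

```python
# Figures quoted in the article.
CORES = 256
OPS_PER_CORE_PER_CYCLE_8BIT = 2048
TRANSISTORS = 22_000_000_000
AREA_MM2 = 800

# Per the article: 4-bit doubles and 2-bit quadruples the 8-bit rate.
PRECISION_SCALE = {8: 1, 4: 2, 2: 4}


def ops_per_cycle(bits: int) -> int:
    """Chip-wide operations per clock cycle at the given precision."""
    return CORES * OPS_PER_CORE_PER_CYCLE_8BIT * PRECISION_SCALE[bits]


for bits in (8, 4, 2):
    print(bits, ops_per_cycle(bits))
# 8 524288
# 4 1048576
# 2 2097152

print(TRANSISTORS / AREA_MM2)  # 27500000.0 transistors per square millimeter
```

To turn operations per cycle into operations per second you would multiply by the clock frequency, which would have to come from the paper itself.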

IBM_NP_PCIe-PCB-Rear.png
The NorthPole chip on a PCIe card.

“Architecturally, NorthPole blurs the boundary between compute and memory,” Modha said. “At the level of individual cores, NorthPole appears as memory-near-compute and from outside the chip, at the level of input-output, it appears as an active memory.” This makes NorthPole easy to integrate in systems and significantly reduces load on the host machine.

But the biggest advantage of NorthPole is also a constraint: it can only easily pull from the memory it has onboard. All of the speedups that are possible on the chip would be undercut if it had to access information from another place.
Via an approach called scale-out, NorthPole can actually support larger neural networks by breaking them down into smaller sub-networks that fit within NorthPole’s model memory, and connecting these sub-networks together on multiple NorthPole chips. So while there is ample memory on a NorthPole (or collectively on a set of NorthPoles) for many of the models that would be useful for specific applications, this chip is not meant to be a jack of all trades. “We can’t run GPT-4 on this, but we could serve many of the models enterprises need,” Modha said. “And, of course, NorthPole is only for inferencing.”
This efficacy means that the device also doesn’t need bulky liquid-cooling systems to run — fans and heat sinks are more than enough — meaning that it could be deployed in some rather small spaces.
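To make the scale-out idea described above a bit more concrete, here is a minimal Python sketch that splits a sequence of layers into consecutive groups, each small enough to fit one chip's on-board memory. It is a simple greedy partition under my own assumptions: the function name `partition_layers`, the layer sizes and the capacity are all hypothetical, and the article does not say how IBM actually maps sub-networks onto chips.

```python
from typing import List, Sequence


def partition_layers(layer_sizes: Sequence[int], chip_capacity: int) -> List[List[int]]:
    """Greedily group consecutive layer indices so each group fits on one chip."""
    chips: List[List[int]] = []
    current: List[int] = []
    used = 0
    for index, size in enumerate(layer_sizes):
        if size > chip_capacity:
            raise ValueError(f"layer {index} does not fit on a single chip")
        if used + size > chip_capacity:
            chips.append(current)  # close the current chip, start a new one
            current, used = [], 0
        current.append(index)
        used += size
    if current:
        chips.append(current)
    return chips


# Hypothetical layer sizes and per-chip capacity, in arbitrary memory units.
print(partition_layers([4, 3, 5, 2, 6], chip_capacity=8))
# [[0, 1], [2, 3], [4]]
```

Each inner list is the set of layers one chip would hold, with the output of one chip feeding the next. A single layer larger than the capacity raises an error here, which mirrors the constraint that the chip can only easily pull from the memory it has on board.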


Potential applications for NorthPole​

While research into the NorthPole chip is still ongoing, its structure lends itself to emerging AI use cases, as well as more well-established ones.

In testing, the NorthPole team focused primarily on computer vision-related uses, in part because funding for the project came from the U.S. Department of Defense. Some of the primary applications in consideration were detection, image segmentation, and video classification. But it was also tested in other arenas, such as natural language processing (on the encoder-only BERT model) and speech recognition (on the DeepSpeech2 model). The team is currently exploring mapping decoder-only large language models to NorthPole scale-out systems.

When you think of these AI tasks, all sorts of fantastical use cases spring to mind, from autonomous vehicles, to robotics, digital assistants, or spatial computing. Many sorts of edge applications that require massive amounts of data processing in real time could be well-suited for NorthPole. For example, it could potentially be the sort of device that’s needed to move autonomous vehicles from machines that require set maps and routes to operate on a small scale, to ones that can think and react to the rare edge-case situations that make navigating in the real world so challenging even for proficient human drivers. These sorts of edge-cases are the exact sweet spot for future NorthPole applications. NorthPole could enable satellites that monitor agriculture and manage wildlife populations, monitor vehicle and freight for safer and less congested roads, operate robots safely, and detect cyber threats for safer businesses.

What’s next

This is just the start of the work for Modha on NorthPole. The current state of the art for CPUs is 3 nm — and IBM itself is already years into research on 2 nm nodes. That means there’s a handful of generations of chip processing technologies NorthPole could be implemented on, in addition to fundamental architectural innovations, to keep finding efficiency and performance gains.

BIC-Group-Photo_2023-08-10_no-caption.png
Modha, center, with most of the team working on NorthPole.

But for Modha, this is just one important milestone along a continuum that has dominated the last 19 years of his professional career. He’s been working on digital brain-inspired chips throughout that time, knowing that the brain is the most energy-efficient processor we know, and searching for ways to replicate that digitally. TrueNorth was fully inspired by the structures of neurons in the brain — and had as many digital “synapses” in it as the brain of a bee. But sitting on a park bench in 2015 in San Francisco, Modha said he was thinking through his work to date. He had the belief that there was something in marrying the best of traditional processing devices with the structure of processing in the brain, where memory and processing are interspersed throughout the brain. The answer was “brain-inspired computing, with silicon speed,” according to Modha.

Over the next eight years, Modha and his colleagues were single-minded and hermetic in their goal of turning this vision into a reality. Toiling inconspicuously in Almaden, the team didn’t give any lectures or publish any papers on their work, until this year. Each person brought different skills and perspective yet everyone collaborated so that as a whole the team’s contribution was much greater than the sum of the parts. Now, the plan is to show what NorthPole could do, while exploring how to translate the designs into smaller chip production processes and further exploring the architectural possibilities.

This work stemmed from simple ideas — how can we make computers that work like the brain — and after years of fundamental research, has come up with an answer. Something that is really only possible today at a place like IBM Research, where there is the time and space to explore the big questions in computing, and where they can take us. “NorthPole is a faint representation of the brain in the mirror of a silicon wafer,” Modha said.


Here is a 61 page PDF file for the techies…





E605D9B7-AD41-494A-91BF-8A30DEC58FE3.jpeg
 
  • Like
  • Fire
  • Thinking
Reactions: 23 users

Boab

I wish I could paint like Vincent
MegaChips announced an equity stake in Quadric in January 2022 and is also a major investor in a $21M Series B funding round announced in March through their MegaChips LSI USA Corporation subsidiary. The round aims to help Quadric release the next version of its processor architecture, improve the performance and breadth of the Quadric software development kit (SDK), and roll out IP products to be integrated in MegaChips' ASICs and SoCs.
 
  • Like
  • Fire
Reactions: 14 users

Frangipani

Regular
More on IBM’s NorthPole…



Microchip breakthrough may reshape the future of AI​

IBM’s new NorthPole may enable smarter, more efficient, network-independent devices that may even help the U.S. win the microchip war against China.​

PATRICK TUCKER | OCTOBER 19, 2023 02:23 PM ET
A prototype microchip design revealed today by IBM could pave the way for a world of much smarter devices that don’t rely on the cloud or the internet for their intelligence. That could help soldiers who operate drones, ground robots, or augmented-reality gear against adversaries who can target electronic emissions. But the new chip—modeled loosely on the human brain—also paves the way for a different sort of AI, one that doesn’t rely on big cloud and data companies like Amazon or Google.

Unlike traditional chips that separate memory from processing circuits, the NorthPole chip combines the two—like synapses in the brain that hold and process information based on their connection to other neurons. Writing in the journal Science, IBM researchers call it a “neural inference architecture that blurs this boundary by eliminating off-chip memory, intertwining compute with memory on-chip, and appearing externally as an active memory.”

Why is that important and what does it have to do with the future? Today’s computers have at least two characteristics that limit AI development.
First, they need a lot of power. Your brain, running on just 12 watts of power, can retain and retrieve the information you need to have a detailed conversation while simultaneously absorbing, correctly interpreting, and making decisions about the enormous amount of sensory data required to drive a car. But a desktop computer requires 175 watts just to process the ones and zeros of an orderly spreadsheet. This is one reason why computer vision in cars and drones is so difficult, a huge limiting factor for autonomy. This energy inefficiency is one reason why many of today’s AI tools depend on enormous enterprise cloud farms that consume enough energy to power a small town.

The second problem is that we’re reaching the atomic limit of how many transistors we can fit on a chip.
https://www.defenseone.com/ideas/20...gon-purchases/379823/?oref=d1-related-article

The NorthPole chip prototype may help solve both problems. “What we really set out to do is optimize every joule of energy, every capital cost of a transistor, and every opportunity for a single clock cycle, right? So it's been optimized along these three dimensions, energy, space and time,” IBM senior fellow Dharmendra S. Modha said in an interview.

The NorthPole chip has 22 billion transistors and 256 cores, according to the paper. There are, of course, chips with more transistors and more cores. But NorthPole’s unique architecture allows it to operate exponentially more efficiently on tasks like processing moving images. Against a comparable chip with “12nm silicon technology process node and with a comparable number of transistors, NorthPole delivers 25✕ higher frames/joule,” according to the paper. If you wanted to connect a lot of them in an enterprise cloud environment to run a generative AI program like ChatGPT, you could shrink that cloud down considerably. Cloud computing that used to take a massive building of servers suddenly fits in the back of a plane. But of course you also need fewer chips for things like small drones and robots.

Pentagon interest

IBM has been working on such neuromorphic chips for more than ten years, with funding from DARPA’s SyNAPSE program. The program spent more than $50 million between 2008 and 2014, and DOD has since 2019 invested another $90 million on the chips, a top Pentagon research official said.

“This is a prime example of what we would call patient capital,” said Maynard Holliday, assistant defense secretary for critical technologies in the Office of the Under Secretary of Defense for Research and Engineering.

“For us, in the Department of Defense, we've always been looking for low…size, weight and power, and then increased speed for our processors. With the advent of generative AI we recognized that to do [computation] in a low-power fashion, we would need this kind of architecture…especially in a contested environment where our signals may be jammed, GPS may be denied. To be able to do compute at the tactical edge is an advantage.”

Smarter and network-independent chips could vastly improve the ability of various military systems—drones, ground robots, soldier headsets—to perceive and interpret the world around them. They could help ingest a wider variety of data, including audio, optical, infrared, sonar, and LiDAR; and enable the creation of new kinds of sensors, such as “micro power impulse radar,” Holliday said.
“It can do segmentation, which means it can discern, you know, people in a picture, it could classify sounds for you, again, all at the edge” without the help of the internet, he said. It could also revolutionize self-driving cars, not just for the military but in the commercial sector. “You can think about this from a vehicle standpoint [or] from a dismounted soldier standpoint.”

Various military labs, including the Air Force Research Lab and Sandia National Lab, are already looking into uses for the prototype chip, Holliday said.
NorthPole may even enable the military to do more with fewer and more domestically producible chips. That’s a rising concern as more and more officials warn of a potential Chinese invasion of Taiwan, one of the main suppliers of advanced microprocessors for phones, cars, etc.

Holliday said NorthPole already rivals the most advanced chips out of Asia, and future versions are expected to be even more efficient.


“NorthPole is at 14 nanometers. You know, the state-of-the-art stuff that's in our iPhones and other commercial electronics is three nanometers. And that's produced all in Asia at TSMC [in Taiwan] and Samsung. And so the fact that this chip is performing the way it does at 14 nanometers bodes very well. As we descend that technology node curve to single digit nanometers, it's just going to be ever better performing.”

But the United States still has to heavily boost its ability to fabricate such chips in large quantities, he said, a process that’s barely begun.
 
  • Like
  • Fire
  • Thinking
Reactions: 14 users

Xray1

Regular
IBM came out of stealth mode today with NorthPole, an extension of TrueNorth…



19 Oct 2023

A new chip architecture points to faster, more energy-efficient AI

A new chip prototype from IBM Research’s lab in California, long in the making, has the potential to upend how and where AI is used efficiently.


We’re in the midst of a Cambrian explosion in AI. Over the last decade, AI has gone from theory and small tests to enterprise-scale use cases. But the hardware used to run AI systems, although increasingly powerful, was not designed with today’s AI in mind. As AI systems scale, the costs skyrocket. And Moore’s Law, the theory that the density of circuits in processors would double each year, has slowed.

But new research out of IBM Research’s lab in Almaden, California, nearly two decades in the making, has the potential to drastically shift how we can efficiently scale up powerful AI hardware systems.

Since the birth of the semiconductor industry, computer chips have primarily followed the same basic structure, where the processing units and the memory storing the information to be processed are stored discretely. While this structure has allowed for simpler designs that have been able to scale well over the decades, it’s created what’s called the von Neumann bottleneck, where it takes time and energy to continually shuffle data back and forth between memory, processing, and any other devices within a chip. The work by IBM Research’s Dharmendra Modha and his colleagues aims to change this, taking inspiration from how the brain computes. “It forges a completely different path from the von Neumann architecture,” according to Modha.

Over the last eight years, Modha has been working on a new type of digital AI chip for neural inference, which he calls NorthPole. It’s an extension of TrueNorth, the last brain-inspired chip that Modha worked on prior to 2014. In tests on the popular ResNet-50 image recognition and YOLOv4 object detection models, the new prototype device has demonstrated higher energy efficiency, higher space efficiency, and lower latency than any other chip currently on the market, and is roughly 4,000 times faster than TrueNorth.

The first promising set of results from NorthPole chips was published today in Science. NorthPole is a breakthrough in chip architecture that delivers massive improvements in energy, space, and time efficiencies, according to Modha.
Using the ResNet-50 model as a benchmark, NorthPole is considerably more efficient than common 12-nm GPUs and 14-nm CPUs. (NorthPole itself is built on 12 nm node processing technology.) In both cases, NorthPole is 25 times more energy efficient, when it comes to the number of frames interpreted per joule of energy required. NorthPole also outperformed in latency, as well as space required to compute, in terms of frames interpreted per second per billion transistors required. According to Modha, on ResNet-50, NorthPole outperforms all major prevalent architectures, even those that use more advanced technology processes, such as a GPU implemented using a 4 nm process.
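The two metrics quoted above are simple ratios, and writing them out may make the comparison easier to follow. This is only an illustrative sketch: the function names and every input number below are invented for the example and are not measurements from IBM or the Science paper.

```python
def frames_per_joule(frames_per_second: float, watts: float) -> float:
    """Energy efficiency: frames processed per joule (1 watt = 1 joule per second)."""
    return frames_per_second / watts


def frames_per_second_per_billion_transistors(
    frames_per_second: float, transistors: float
) -> float:
    """Space efficiency: throughput normalised by transistor count in billions."""
    return frames_per_second / (transistors / 1e9)


# Hypothetical chips, NOT real benchmark figures.
chip_a = frames_per_joule(frames_per_second=1000.0, watts=250.0)
chip_b = frames_per_joule(frames_per_second=2000.0, watts=20.0)

# How many times more energy efficient chip_b is than chip_a.
ratio = chip_b / chip_a
print(f"chip_a: {chip_a} frames/J, chip_b: {chip_b} frames/J, ratio: {ratio}x")
```

With these made-up inputs the second chip comes out 25 times more energy efficient, which is the kind of ratio the article is describing.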

How does it manage to compute so much more efficiently than existing chips? One of the biggest differences with NorthPole is that all of the memory for the device is on the chip itself, rather than connected separately. Without that von Neumann bottleneck, the chip can carry out AI inferencing considerably faster than other chips already on the market. NorthPole was fabricated with a 12-nm node process, and contains 22 billion transistors in 800 square millimeters. It has 256 cores and can perform 2,048 operations per core per cycle at 8-bit precision, with potential to double and quadruple the number of operations with 4-bit and 2-bit precision, respectively. “It’s an entire network on a chip,” Modha said.
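For anyone who wants the arithmetic behind those figures, here is a small sketch using only the numbers quoted in the paragraph above. The constant and function names are my own. The article gives no clock frequency, so this stops at operations per cycle and does not attempt operations per second.

```python
# Figures quoted in the IBM article.
CORES = 256
OPS_PER_CORE_PER_CYCLE_8BIT = 2048
TRANSISTORS = 22_000_000_000
AREA_MM2 = 800

# The article says throughput doubles at 4-bit and quadruples at 2-bit precision.
PRECISION_MULTIPLIER = {8: 1, 4: 2, 2: 4}


def ops_per_cycle(precision_bits: int) -> int:
    """Whole-chip operations per clock cycle at the given precision."""
    return CORES * OPS_PER_CORE_PER_CYCLE_8BIT * PRECISION_MULTIPLIER[precision_bits]


for bits in (8, 4, 2):
    print(f"{bits}-bit: {ops_per_cycle(bits):,} operations per cycle")

# Average transistor density across the die.
density_per_mm2 = TRANSISTORS / AREA_MM2
print(f"{density_per_mm2:,.0f} transistors per square millimeter")
```

That works out to 524,288 operations per cycle at 8-bit, and about 27.5 million transistors per square millimeter.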

The NorthPole chip on a PCIe card.

“Architecturally, NorthPole blurs the boundary between compute and memory,” Modha said. “At the level of individual cores, NorthPole appears as memory-near-compute and from outside the chip, at the level of input-output, it appears as an active memory.” This makes NorthPole easy to integrate in systems and significantly reduces load on the host machine.

But the biggest advantage of NorthPole is also a constraint: it can only easily pull from the memory it has onboard. All of the speedups that are possible on the chip would be undercut if it had to access information from another place.
Via an approach called scale-out, NorthPole can actually support larger neural networks by breaking them down into smaller sub-networks that fit within NorthPole’s model memory, and connecting these sub-networks together on multiple NorthPole chips. So while there is ample memory on a NorthPole (or collectively on a set of NorthPoles) for many of the models that would be useful for specific applications, this chip is not meant to be a jack of all trades. “We can’t run GPT-4 on this, but we could serve many of the models enterprises need,” Modha said. “And, of course, NorthPole is only for inferencing.”
This efficiency means that the device also doesn’t need bulky liquid-cooling systems to run (fans and heat sinks are more than enough), meaning that it could be deployed in some rather small spaces.
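To give a rough feel for the scale-out idea described above (splitting a network into sub-networks that each fit in one chip's memory), here is a minimal sketch. It is not IBM's mapping method, which the article does not describe. The greedy grouping, the function name, the layer sizes and the per-chip memory figure are all assumptions made up for illustration.

```python
from typing import List


def partition_layers(layer_sizes_mb: List[float], chip_memory_mb: float) -> List[List[int]]:
    """Greedily group consecutive layers so that each group fits on one chip.

    Returns a list of groups, each a list of layer indices.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    used = 0.0
    for index, size in enumerate(layer_sizes_mb):
        if size > chip_memory_mb:
            raise ValueError(f"layer {index} ({size} MB) does not fit on a single chip")
        if used + size > chip_memory_mb:
            # Current chip is full: close this sub-network and start the next one.
            groups.append(current)
            current, used = [], 0.0
        current.append(index)
        used += size
    if current:
        groups.append(current)
    return groups


# Hypothetical layer sizes and chip memory, NOT NorthPole's real figures.
layers = [40.0, 60.0, 80.0, 30.0, 90.0]
plan = partition_layers(layers, chip_memory_mb=128.0)
print(plan)  # [[0, 1], [2, 3], [4]]
```

In this toy case the five layers end up spread over three chips. A single layer larger than one chip's memory cannot be placed at all in this simple scheme, which echoes the point that the chip is not meant to be a jack of all trades.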


Potential applications for NorthPole

While research into the NorthPole chip is still ongoing, its structure lends itself to emerging AI use cases, as well as more well-established ones.

In testing, the NorthPole team focused primarily on computer vision-related uses, in part because funding for the project came from the U.S. Department of Defense. Some of the primary applications in consideration were detection, image segmentation, and video classification. But it was also tested in other arenas, such as natural language processing (on the encoder-only BERT model) and speech recognition (on the DeepSpeech2 model). The team is currently exploring mapping decoder-only large language models to NorthPole scale-out systems.

When you think of these AI tasks, all sorts of fantastical use cases spring to mind, from autonomous vehicles, to robotics, digital assistants, or spatial computing. Many sorts of edge applications that require massive amounts of data processing in real time could be well-suited for NorthPole. For example, it could potentially be the sort of device that’s needed to move autonomous vehicles from machines that require set maps and routes to operate on a small scale, to ones that can think and react to the rare edge-case situations that make navigating in the real world so challenging even for proficient human drivers. These sorts of edge cases are the exact sweet spot for future NorthPole applications. NorthPole could enable satellites that monitor agriculture and manage wildlife populations, monitor vehicles and freight for safer and less congested roads, operate robots safely, and detect cyber threats for safer businesses.

What’s next

This is just the start of the work for Modha on NorthPole. The current state of the art for CPUs is 3 nm — and IBM itself is already years into research on 2 nm nodes. That means there’s a handful of generations of chip processing technologies NorthPole could be implemented on, in addition to fundamental architectural innovations, to keep finding efficiency and performance gains.

Modha, center, with most of the team working on NorthPole.

But for Modha, this is just one important milestone along a continuum that has dominated the last 19 years of his professional career. He’s been working on digital brain-inspired chips throughout that time, knowing that the brain is the most energy-efficient processor we know, and searching for ways to replicate that digitally. TrueNorth was fully inspired by the structures of neurons in the brain — and had as many digital “synapses” in it as the brain of a bee. But sitting on a park bench in 2015 in San Francisco, Modha said he was thinking through his work to date. He had the belief that there was something in marrying the best of traditional processing devices with the structure of processing in the brain, where memory and processing are interspersed throughout the brain. The answer was “brain-inspired computing, with silicon speed,” according to Modha.

Over the next eight years, Modha and his colleagues were single-minded and hermetic in their goal of turning this vision into a reality. Toiling inconspicuously in Almaden, the team didn’t give any lectures or publish any papers on their work, until this year. Each person brought different skills and perspective yet everyone collaborated so that as a whole the team’s contribution was much greater than the sum of the parts. Now, the plan is to show what NorthPole could do, while exploring how to translate the designs into smaller chip production processes and further exploring the architectural possibilities.

This work stemmed from simple ideas — how can we make computers that work like the brain — and after years of fundamental research, has come up with an answer. Something that is really only possible today at a place like IBM Research, where there is the time and space to explore the big questions in computing, and where they can take us. “NorthPole is a faint representation of the brain in the mirror of a silicon wafer,” Modha said.


Here is a 61-page PDF file for the techies…





Is this a concern for BRN ????????? !!!!!!!!
 
  • Like
  • Haha
  • Sad
Reactions: 10 users

IloveLamp

Top 20
  • Haha
  • Sad
  • Like
Reactions: 14 users

Tels61

Member
Oh for goodness sake, the guy asks a simple relevant question, then gets the usual flogging by mindless GIFs. Come on guys, instead of childish GIFs, respond with some sensible answers.
 
  • Like
  • Love
  • Fire
Reactions: 49 users

Xray1

Regular
Oh for goodness sake, the guy asks a simple relevant question, then gets the usual flogging by mindless GIFs. Come on guys, instead of childish GIFs, respond with some sensible answers.
Thanks Tels61 for your support... much appreciated.

I thought I had landed on the HotCrapper site by mistake with those stupid responses.
 
  • Like
  • Haha
  • Love
Reactions: 21 users
Oh for goodness sake, the guy asks a simple relevant question, then gets the usual flogging by mindless GIFs. Come on guys, instead of childish GIFs, respond with some sensible answers.
Yeah this place is becoming a shit show sadly...

Highly relevant question. How does this affect Akida, and how does the fact that they have better name recognition affect our product and standing in the market?

But sure, belt anyone who questions anything here to death with GIFs...
 
  • Like
  • Love
  • Fire
Reactions: 31 users