BRN Discussion Ongoing

Bravo

If ARM was an arm, BRN would be its biceps💪!
Hi Bravo,

Back in February, I posted the Snapdragon Hexagon spec sheet:

https://www.qualcomm.com/content/da...ocuments/Snapdragon-8-Gen-2-Product-Brief.pdf

Artificial Intelligence
Qualcomm® Adreno™ GPU, Qualcomm® Kryo™ CPU, Qualcomm® Hexagon™ Processor
• Fused AI Accelerator Architecture
• Hexagon Tensor Accelerator
• Hexagon Vector eXtensions
• Hexagon Scalar Accelerator
• Hexagon Direct Link
• Support for mixed precision (INT8+INT16)
• Support for all precisions (INT4, INT8, INT16, FP16)
• Micro Tile Inferencing
Qualcomm® Sensing Hub
• Dual AI Processors for audio and sensors
• Always-Sensing camera


Our AI Engine includes the Qualcomm® Hexagon™ Processor, with revolutionary micro tile inferencing and faster Tensor accelerators for up to 4.35x faster AI performance than its predecessor. Plus, support for INT4 precision boosts performance-per-watt by 60% for sustained AI inferencing.


MicroTiles are part of transformers, such as ViT (the Vision Transformer), which is said to be more efficient than LSTMs.
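For anyone unfamiliar with the jargon: a ViT works by chopping an image into small, fixed-size tiles (patches) and feeding each tile to the transformer as a token, which is presumably the connection to "micro tile inferencing". A minimal, generic sketch of that patching step (plain NumPy; the 16-pixel tile size is just the common ViT default, and nothing here is Qualcomm- or Akida-specific):

```python
import numpy as np

def image_to_tiles(image: np.ndarray, tile: int = 16) -> np.ndarray:
    """Split an (H, W, C) image into flattened, non-overlapping tiles,
    the patching step a ViT performs before embedding each tile as a token.
    Assumes H and W are multiples of `tile`."""
    h, w, c = image.shape
    grid = image.reshape(h // tile, tile, w // tile, tile, c)
    grid = grid.transpose(0, 2, 1, 3, 4)        # (rows, cols, tile, tile, C)
    return grid.reshape(-1, tile * tile * c)    # one row per tile (token)

# A 224x224 RGB image becomes 14 x 14 = 196 tiles of 16x16 pixels.
img = np.zeros((224, 224, 3), dtype=np.uint8)
print(image_to_tiles(img).shape)                # (196, 768)
```

Working on 196 tile tokens instead of 50,176 individual pixels is what makes transformer attention tractable on images.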

Hi Dodgy,

So, could it be feasible that Qualcomm was an early access partner of ours and used our technology to inform their micro tile inferencing approach? And furthermore, could the next-gen Akida, with all its ViTs, TENNs and whistles, be the product of Qualcomm's input on suggested changes/enhancements that we are about to see rolling out shortly?
 
  • Like
  • Thinking
Reactions: 23 users

IloveLamp

Top 20
  • Like
  • Fire
Reactions: 8 users

IloveLamp

Top 20
Screenshot_20230729_140713_LinkedIn.jpg
 
  • Like
  • Fire
  • Love
Reactions: 42 users

IloveLamp

Top 20
Screenshot_20230729_153853_LinkedIn.jpg
Screenshot_20230729_153915_LinkedIn.jpg
 
  • Like
  • Fire
  • Love
Reactions: 33 users

Zedjack33

Regular
is it possible/plausible?

Apologies if already posted.

 

Attachments

  • 744F48D4-7C0A-4567-9564-44605F2B7A99.jpeg
  • Like
  • Thinking
Reactions: 8 users

manny100

Regular

To all the wonderful dot joiners out there.
God bless you and your little cotton sox.
🤣
Thank you all for sharing.
I do not dot join simply because I can never see the dots. I just see the big picture, and BRN is on the cusp of huge growth.
Limitless opportunities.
With EVs coming in, every car becomes a potential income earner for BRN. Loads of other opportunities out there. Almost everything we use will eventually contain a chip.
 
  • Like
  • Fire
  • Love
Reactions: 23 users

Cardpro

Regular


Hi Dodgy,

So, could it be feasible that Qualcomm was an early access partner of ours and used our technology to inform their micro tile inferencing approach? And furthermore, could the next-gen Akida, with all its ViTs, TENNs and whistles, be the product of Qualcomm's input on suggested changes/enhancements that we are about to see rolling out shortly?
Given they recently formed a partnership with Prophesee, I am guessing they could benefit from Prophesee and hopefully from BrainChip too :)

I hope Prophesee's key technology has been advanced thanks to Akida :)
 
  • Like
Reactions: 6 users
So we know that Akida is good for vibration & predictive maintenance analysis as well.

We know we work with Renesas, and they own Reality AI.

The comments at the bottom of this post from Reality AI are taken from the BRN website.

We also know Renesas works with vibration sensors.

It would be so good if they decided to run some sims with Akida as well in the PoC, or maybe that's wishful thinking :unsure:






Nidec and Renesas to develop semiconductor solutions for next-generation e-axle​

By Callum Brook-Jones, 5th June 2023

Through a new collaboration, Nidec and Renesas Electronics will co-develop semiconductor solutions for a next-generation e-axle known as the X-in-1 system. The solution will integrate an electric vehicle drive motor and power electronics.

An ever-increasing number of EVs utilize 3-in-1 units/e-axles which integrate a motor, inverter and a gearbox (reduction gear). To achieve high levels of performance and efficiency from a more compact, lighter and cost-effective package, power electronics controls – including DC-DC converters and onboard chargers (OBCs) – are being integrated into EVs. In addition to these, certain manufacturers are developing X-in-1 platforms which integrate several functions to speed up adoption across differing model types.

X-in-1 systems are complex due to having several integrated functions. As a result, maintaining a high level of quality can be challenging, meaning preventive safety technologies – including diagnostic functions and failure prediction systems – are vital to ensure security and safety.

To overcome such issues, Nidec’s motor technology will be combined with Renesas’s semiconductor technology to develop a proof of concept (PoC) for the X-in-1 system that benefits from high reliability and performance. The first PoC is scheduled to launch by the end of 2023 and will feature a 6-in-1 system, consisting of a DC-DC converter, OBC and power distribution unit (PDU), in addition to a motor, inverter and gearbox.
Building on the first system, the partnership will work on the development of a highly integrated X-in-1 PoC in 2024.

This solution will include a battery management system (BMS) and additional components. The first PoC will feature power devices based on SiC (silicon carbide), while the second will replace the DC-DC and OBC power devices with GaN (gallium nitride).

“As we celebrate Nidec’s 50th anniversary, we take on a significant challenge of developing a world-class next-generation X-in-1 system, which goes back to our core principle of pioneering the world’s best innovations,” said Mitsuya Kishida, executive vice president and executive general manager of Automotive Motor & Electronic Control Business Unit at Nidec.

“By harnessing our strengths in automotive technology and developing PoCs together with Renesas, a leader in automotive semiconductor solutions, we aim to lead the market as a world-leading e-axle provider.”

“We are very pleased to announce our collaboration with Nidec, who has an exceptional track record of success in e-axle traction motors,” said Vivek Bhan, senior vice president, co-general manager of High Performance Computing, Analog and Power Solutions Group, Renesas.

“Our contribution to this collaboration extends beyond hardware design, encompassing software development which is critical to enabling rapid development of PoCs for our customers.”


BRN website.

“We see a growing number of predictive industrial (including HVAC, motor control) or automotive (including fleet maintenance), building automation, remote digital health equipment and other AIoT applications use complex models with minimal impact to product BOM and need faster real-time performance at the Edge” said Nalin Balan, Head of Business Development at Reality ai, a Renesas company. “BrainChip’s ability to efficiently handle streaming high frequency signal data, vision, and other advanced models at the edge can radically improve scale and timely delivery of intelligent services.”

Nalin Balan, Head of Business Development, Reality.ai, a Renesas Company
 
  • Like
  • Fire
  • Love
Reactions: 36 users

Tothemoon24

Top 20

The below is 4 months old & has most likely been posted.



BrainChip is so beautifully positioned to be an incredibly successful company.

Inside Arm’s vision for the ‘software-defined vehicle’ of the future​

The chip giant is betting big on cars
April 11, 2023 - 12:33 pm



The digitisation of cars has made comparisons to “data centres on wheels” so common that they’ve become clichéd. It’s also built a booming market for tech firms — few of which have capitalised as adeptly as Arm.
Often described as the UK’s leading IT company, SoftBank-owned Arm designs energy-efficient computer chips. The company’s architectures are found in endless applications, from smart cities to laptops, but they’re best-known for powering mobile devices. Around 95% of the world’s smartphones use Arm’s technology.
In recent years, however, the company’s fastest-growing division has been the automotive unit. Arm has reportedly more than doubled its revenues from the sector since 2020.
Dennis Laudick, Arm’s vice president of automotive go-to-market, attributes the growth to a convergence of three trends: electrification, automation, and in-vehicle user experience (UX).
“All of those are driving more compute into the vehicle,” he says — and more compute means more business for Arm.
As the company prepares for a long-awaited public listing, Laudick gave TNW a glimpse into his automotive strategy.

Electric avenues​

Gradually, EVs are engulfing the car market. Last year, fully-electric vehicles comprised over 10% of car sales in Europe for the first time. Globally, their total sales hit around 7.8 million units — as much as 68% more than in 2021. To serve this growing market, automakers have to integrate a complex new collection of electronics.
“When you do that, it becomes a lot more complicated system,” says Laudick. “You need to look at even more electronics to manage it, and that causes people to rethink their architectures.”
The result is firmer foundations for more digital features. Take the all-electric Nissan Leaf, which runs Arm’s Cortex-R4 processor alongside an electric powertrain.
To control the power inverter, a microcomputer core has to accurately repeat a series of processes — such as sensing, calculation, and control output — for events that occur in 1/10,000-second cycles. In this tiny computation window, the system has to deliver efficient, responsive, and precise control.
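As a schematic illustration of that sense-calculate-actuate cycle: real inverter control runs as compiled firmware on a microcontroller core, not in Python, so the sketch below only shows the shape of a fixed-rate loop. The 10 kHz period comes from the article; the three callables are placeholders, not real Nissan or Arm APIs.

```python
import time

CYCLE_S = 1.0 / 10_000   # the 1/10,000-second control period described above

def control_loop(read_sensors, compute_command, apply_output):
    """Fixed-rate sense -> calculate -> control-output loop (schematic only)."""
    deadline = time.perf_counter()
    while True:
        measurements = read_sensors()             # sensing
        command = compute_command(measurements)   # calculation
        apply_output(command)                     # control output
        deadline += CYCLE_S
        remaining = deadline - time.perf_counter()
        if remaining > 0:
            time.sleep(remaining)                 # hold the 10 kHz cadence
```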
The Leaf’s battery has been moved closer to the car’s centre of gravity. According to Nissan, this provides better stability and cornering than front-engine vehicles. Credit: Nissan
The Leaf also has a new electronic pedal system, which the driver uses to control the car’s speed by applying pressure to the accelerator.
When the accelerator is fully released, regenerative and friction brakes are activated automatically, bringing the car to a complete stop — even on steep slopes — until the accelerator is pressed again. And if the driver gets tired, an intelligent cruise control system can automatically match the car’s speed to the flow of traffic, while a lane assist feature makes subtle steering corrections to keep the vehicle centred.
It’s a convenient package of features, but one that reimagines the whole foundations of a car. The likes of Nissan had spent decades establishing the controls and architectures that run internal combustion engines (ICEs). They’re now rapidly replacing that hardware with digital operations. The shift has fostered a concept called the “software-defined vehicle.”
“The whole industry is aware of this disruption that’s converting them from a mechanical mindset to a software mindset — and they’re all trying to reinvent themselves,” says Laudick.


Undoubtedly, the transition has opened up new business opportunities for Original Equipment Manufacturers (OEMs), component suppliers, startups, and semiconductor companies. But all the new features and revenue streams have to fit within the tight constraints of power consumption, heat dissipation, and physical space.
That’s where Arm wants to step in. The company’s suite of processor IP, tools, and software solutions offers the automotive sector the promise of maximising innovation.
“From our perspective, it basically equates to more electronics — and more powerful electronics,” says Laudick.

Autonomy rules​

The transition to EVs has coincided with an expansion of autonomous features. While level 5 cars haven’t arrived as quickly as advertised, advanced driver assistance systems (ADAS), from lane detection to park assist, have become commonplace. As a result, the applications for Arm’s architectures are proliferating.
“The more autonomous functionality we drive into cars, the more exponential the compute demands are,” says Laudick. “And if you look at some of the data systems that people are looking at putting in cars in five years’ time, they’re really high-end.”
At present, Arm powers everything from processors that Dream Chip Technologies applies to radar to smart electronic fuses that Elmos uses to supply stable power. As the use cases expand, so does the demand for chips — and the rules that surround them.
The European Commission president, Ursula von der Leyen, has pushed to ban new combustion-engine cars. Credit: European Parliament
Both EVs and autonomous features are being pushed by regulators. Governments are encouraging electrification for environmental reasons, and autonomy for accident prevention.
In the EU, several safety features will soon become compulsory. The European Parliament has made measures including intelligent speed assistance (ISA), advanced emergency braking, and lane-keeping technology mandatory in new vehicles from May 2022.


The lawmakers made a compelling case for their intervention. In 2018, around 25,100 people died on EU roads, while 135,000 were seriously injured. According to EU estimates, ISA alone could reduce the fatalities by 20%.
“ISA will provide a driver with feedback, based on maps and road sign observation, always when the speed limit is exceeded,” said MEP Róża Thun, who steered the legislation. “We do not introduce a speed limiter, but an intelligent system that will make drivers fully aware when they are speeding. This will not only make all of us safer, but also help drivers to avoid speeding tickets.”
It’s a similar story for electric vehicles. According to the European Commission, cars are responsible for 12% of total CO2 emissions in the EU. To mitigate the impact, the union recently approved a law requiring all new cars sold from 2035 to have zero CO2 emissions. In addition, from 2030, their emissions must already be 55% lower than they were in 2021.
The targets aim to accelerate electrification. In theory, this should benefit drivers, passengers, pedestrians — and Arm.

Getting flexible​

As automotive compute shifts from hardware to software, demand is growing for infotainment and cockpit features. According to Arm, more than 90% of in-vehicle infotainment (IVI) systems use the company’s chip designs. The architectures are also found in various under-the-hood applications, including meter clusters, e-mirrors, and heating, ventilation, and air conditioning (HVAC) control.
Munich-based automotive company Apostera, for example, is using Arm’s designs to remove the disconnect between the real world and the infotainment system by transforming the windshield of a vehicle into a mixed-reality screen.
The shift to the software-defined vehicle has also stimulated another IT feature: updates. Historically, vehicle software was not only rudimentary, but also fairly static. Today, that’s no longer the case.
“There’s an opportunity to continue to add to the functionality of the vehicle over its lifetime,” says Laudick.
An expanding range of features, from sensor algorithms to user interfaces, can now be enhanced over-the-air (OTA). As cars begin to resemble personal devices, consumers can expect a comparable update service. As Simon Humphries, the chief branding officer of Toyota, put it: “People want control over their own experiences.”
Laudick likens modern cars to platforms, upon which software and functionality can evolve. That’s an obvious magnet for Arm, whose products and processes are fundamentally about running software.
Carmakers are also becoming savvier about software. For example, General Motors’ self-driving unit, Cruise, is now developing its own computer chips for autonomous vehicles. The company has previously used Arm designs, but is now exploring an open-source architecture known as RISC-V — which is becoming a popular alternative. The instruction set’s low costs and flexibility have created a threat to Arm’s automotive ambitions.
“One executive I was talking to said: ‘The best negotiating strategy when Arm comes in is to have a RISC-V brochure sitting on my desk’,” Jim Feldhan, the president of semiconductor consultancy Semico Research, said last year. “It’s a threat. Arm is just not going to have its super dominant position in five or 20 years.”


Currently, however, RISC-V could be regarded as riskier than Arm’s established standards. In a further challenge to RISC-V, Arm is gradually becoming more open. The Cortex-M processor series, for instance, now allows clients to add their own instructions, while extra configurability has been added to Arm software and tooling.
“We obviously try to control the products reasonably well, otherwise we just end up with a wild west. But there’s been a move in the company in the last several years to create more flexibility in certain areas,” says Laudick.
Mobileye, a self-driving unit of Intel that went public at $16.7 billion last year, is among a growing list of companies applying RISC-V architecture to vehicles. Credit: Mobileye
RISC-V is far from Arm’s only challenger. Established rivals such as Intel and Synopsys are also fighting for a chunk of the expanding market for automotive chips.
Nonetheless, Laudick is bullish about the future. He notes that today’s cars run about 100 million lines of software code, while a Boeing 787 is estimated to have “only” 14 million. By 2030, McKinsey predicts that vehicles will expand to roughly 300 million lines of code.
“I see the vehicle being, without doubt, the most complex software device you will own — if not that will exist,” says Laudick.
 
  • Like
  • Fire
Reactions: 25 users

Fenris78

Regular
The timing is quite interesting against our own tape out don't you think....?

I thought the same.... "General availability will follow in Q3 2023." Would be great if Intel were on the front foot to release their new chips with neural VPUs, to gain a first-to-market advantage, prior to general availability.
 
  • Like
  • Fire
Reactions: 10 users

Lex555

Regular
Thanks @Diogenese, but as a layman I’m confused by your last paragraph. Are you saying this spec sheet increases or decreases the likelihood of Akida being in Snapdragon?
 
  • Like
Reactions: 3 users

Diogenese

Top 20


Hi Dodgy,

So, could it be feasible that Qualcomm was an early access partner of ours and used our technology to inform their micro tile inferencing approach? And furthermore, could the next-gen Akida, with all its ViTs, TENNs and whistles, be the product of Qualcomm's input on suggested changes/enhancements that we are about to see rolling out shortly?
Hi Bravo,

From my research into Qualcomm, I think that they have developed their own AI in-house, including the ViT.

I don't think that there is any relationship with BRN.
 
  • Like
  • Sad
  • Love
Reactions: 24 users

IloveLamp

Top 20

"Those chips, which have appeared in laptops like the Samsung Galaxy Book3 Ultra, really haven’t been able to tap any “AI” functions besides Microsoft-driven enhancements to video calls placed with Windows, known as Windows Studio Effects. (AMD has a rival effort, known as Ryzen AI.) That, for now, has left any concept of an AI PC without much meaning, though Gelsinger clearly believes an AI renaissance is in the cards."

The answer, Gelsinger said, was to infuse local processors with AI instead. “We expect that Intel… is the one that’s going to truly democratize AI at the client and at the edge,” he said. “And we do believe that this will become a [market driver] because people will say oh, I want those new use cases. They make me more efficient and more capable, just like Centrino made me more efficient, because I didn’t have to plug into the wire.”

Otherwise, Gelsinger and Intel said that Intel remains on track with its existing process roadmap: Intel 7 is ramping now, Intel 4 will begin with Meteor Lake in the second half of 2023, and Intel 3, 20A, and the Intel 18A process are still on Intel’s previous schedule.
 
Last edited:
  • Like
  • Fire
  • Thinking
Reactions: 7 users

Getupthere

Regular


GenAI Breaks The Data Center (Part II): Moving GenAI To The Edge Through On-Device Computing


The rapid progress of Generative Artificial Intelligence (GenAI) has raised concerns about the sustainable economics of emerging GenAI services. Can Microsoft, Google, and Baidu offer chat responses to every search query made by billions of global smartphone and PC users? One possible resolution to this challenge is to perform a significant proportion of GenAI processing on edge devices, such as personal computers, tablets, smartphones, extended reality (XR) headsets, and eventually wearable devices.


The first article in this series (GenAI Breaks The Data Center: The Exponential Costs To Data Center) predicted that the processing requirements of GenAI, including Large Language Models (LLMs), will increase exponentially through the end of the decade as rapid growth in users, usage, and applications drives data center growth. Tirias Research estimates that GenAI infrastructure and operating costs will exceed $76 billion by 2028. To improve the economics of emerging services, Tirias Research has identified four steps that can be taken to reduce operating costs. First, usage steering to guide users to the most efficient computational option to accomplish their desired outcome. Second, model optimization to improve the efficiency of models employed by users at scale. Third, computational optimization to improve neural network computation through compression and advanced computer science techniques. Last, infrastructure optimization to deploy cost-optimized data center architectures and offload GenAI workloads to edge devices. This framework can show how, at each step, optimization for client devices might occur.


Usage Steering


GenAI is able to perform creative and productive work. However, GenAI generates an entirely new burden on the cloud, and potentially client devices. At several points in the user journey, from research to the creation of a query or task, a service provider can steer users toward specialized neural networks for a more tailored experience. For GenAI, users can be steered toward models that are specifically trained on their desired outcome, allowing the use of specialized neural networks that contain fewer parameters compared with more general models. Further, models may be defined such that user queries can activate only a partial network, allowing the remainder of the neural network to remain inactive and not executed.


The edge, where users employ web-browsers, is a likely point of origin for user requests where an application or local service might capture a GenAI request and choose to execute it locally. This could include complex tasks, such as text generation; image and video generation, enhancement, or modification; audio creation or enhancement; and even code generation, review, or maintenance.


Model & Computational Optimization


While neural network models can be prototyped without optimization, the models we see deployed for millions of users will need to trade off both computational efficiency and accuracy. The typical scenario is that the larger the model, the more accurate the result but in many cases, the increase in accuracy comes at a high price with only minimal benefit. The size of the model is typically measured in parameters, where fewer parameters correspond linearly to the amount of time or computational resources required. If the number of parameters is halved while maintaining reasonable accuracy, users can run a model with half the number of accelerated servers and roughly half the total cost of ownership (TCO), which includes both amortized capital cost and operating costs. This includes models that may run multiple passes before generating a result.


Optimizing AI models is accomplished through quantization, pruning, knowledge distillation, and model specialization. Quantization essentially reduces the range of potential outcomes by limiting the number of potential values or outcomes to a defined set of values rather than allowing a potentially infinite number of values. This is accomplished by representing the weights and activations with lower-precision data types including 4-bit or 8-bit integer (INT4 or INT8) instead of the standard high-precision 32-bit floating point (FP32) data type. Another way to reduce the size of a neural network is to prune the trained model of parameters that are redundant or unimportant. Typical compression targets range from 2X to 3X with nearly the same accuracy. Knowledge distillation uses a large, trained model to train a smaller model. A good example of this is the Vicuna-13B model which was trained from user-shared conversations with OpenAI’s GPT and fine-tuned on Facebook’s 65-billion parameter LLaMA model. A subset of knowledge distillation is model specialization, the development of smaller models for specific applications, such as using ChatGPT to answer only questions about literature, mathematics, or medical treatments rather than any generalized question. These optimization techniques can reduce the number of parameters dramatically. In forecasting the operating costs of GenAI, we take these factors into consideration, assuming that competitive and economic pressures push providers to highly optimized model deployments, reducing the anticipated capital and operating costs over time.
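As a concrete illustration of the quantization step described above, here is a minimal sketch of symmetric linear quantization of a weight tensor to INT8 (plain NumPy, not any particular framework's API; real deployments also calibrate activations and often quantize per-channel):

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map FP32 weights onto the symmetric INT8 grid [-127, 127]."""
    scale = np.abs(weights).max() / 127.0                 # one scale per tensor
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale                   # approximate originals

w = np.random.randn(256, 256).astype(np.float32)
q, s = quantize_int8(w)
print(q.nbytes / w.nbytes)                                # 0.25: 4x smaller
print(float(np.abs(w - dequantize(q, s)).max()))          # small rounding error
```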


Infrastructure Optimization with On-Device GenAI


Improving the efficiency of GenAI models will not overcome the requirements of what Tirias Research believes will be necessary to support GenAI over just the next five years. Much of it will need to be performed on-device, which is often referred to as “Edge GenAI.” While Edge GenAI workloads seemed unlikely just months ago, models of up to 10 billion parameters are increasingly viewed as candidates for the edge, operating on consumer devices utilizing model optimization and reasonable forecasts for increased device AI performance. For example, at Mobile World Congress earlier this year, Qualcomm demonstrated a Stable Diffusion model generating images on a smartphone powered by the company’s Snapdragon 8 Gen 2 processor. And recently, Qualcomm announced their intention to deliver large language models based on Meta’s LLaMA 2 on the Snapdragon platform in 2024. Similarly, GPU-accelerated consumer desktops can run the LLaMA 1-based Vicuna-13B model with 13 billion parameters, producing results similar to, but of slightly lower quality than, GPT-3.5. Optimization will reduce the parameter count of these networks and thereby reduce the memory and processing requirements, placing them within the capacity of mainstream personal computing devices.
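A rough back-of-envelope on why those parameter counts and precisions matter for on-device models: weight storage alone is parameters times bytes per parameter, so the ~10-billion-parameter ceiling mentioned above shrinks from about 20 GB at FP16 to about 5 GB at INT4 (activations, context caches and runtime overhead are ignored in this sketch):

```python
BYTES_PER_PARAM = {"FP32": 4.0, "FP16": 2.0, "INT8": 1.0, "INT4": 0.5}

def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate storage for the weights only; ignores activations/overhead."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

for precision in ("FP16", "INT8", "INT4"):
    print(f"10B params @ {precision}: {weight_memory_gb(10e9, precision):.1f} GB")
# 10B params @ FP16: 20.0 GB, INT8: 10.0 GB, INT4: 5.0 GB
```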


It's not difficult to imagine how GenAI, or any AI application, can move to a device like a PC, smartphone, or XR headset. The smartphone platform has already shown its ability to advance its processing, memory, and sensor technology so rapidly that in under a decade, smartphones replaced point-and-shoot cameras, consumer video cameras, DSLRs, and in some cases even professional cameras. The latest generation of smartphones can capture and process 4K video seamlessly, and in some cases even 8K video using AI-driven computational photo and video processing. All major smartphone brands already leverage AI technology for a variety of functions ranging from battery life and security to audio enhancement and computational photography. Additionally, AMD, Apple, Intel, and Qualcomm are incorporating inference accelerators into PC/Mac platforms. The same is true for almost all major consumer platforms and edge networking solutions. The challenge is matching the GenAI models to the processing capabilities of these edge AI processors.


While the performance improvements in mobile SoCs will not outpace the parameter growth of some GenAI applications like ChatGPT, Tirias Research believes that many GenAI models can be scaled for on-device processing. The size of the models that will be practical for on-device processing will increase over time. Note that the chart below assumes an average for on-device GenAI processing. In the development of the GenAI Forecast & TCO (Total Cost of Ownership) model, Tirias Research breaks out different classes of devices. Processing on device not only reduces latency, but it also addresses another growing concern – data privacy and security. By eliminating the interaction with the cloud, all data and the resulting GenAI results remain on the device.


Even with the potential for on-device processing, many models will exceed the processing capabilities for on-device processing and/or will require cloud interaction for a variety of reasons. GenAI applications that leverage a hybrid computing model might perform some processing on device and some in the cloud. One reason for hybrid GenAI processing might be the large size of the neural network model or the repetitive use of the model. By using hybrid computing, the device would process the sensor or input data and handle the smaller portions of the model while leaving the heavy lifting to the cloud. Image or video generation would be a good example where the initial layer or layers could be generated on devices. This could be the initial image, and then the enhanced image or the following images in a video could be generated by the cloud. Another reason might be the need for input from multiple sources, like generating updated maps in real-time. It would be more effective to use the information from multiple sources combined with the pre-existing models to effectively route vehicle traffic or network traffic. And in some cases, the model may be using data that is proprietary to a vendor, requiring some level of cloud processing to protect the data, such as for industrial or medical purposes. The need to use multiple GenAI models may also require hybrid computing because of the location or size of the models. Yet another reason might be the need for governance. While an on-device model may be able to generate a solution, there may still be the need to ensure that the solution does not violate legal or ethical guidelines, such as issues that have already arisen from GenAI solutions that infringe on copyrights, make up legal precedents, or tell consumers to do something that is beyond ethical boundaries.
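A minimal sketch of the kind of device-versus-cloud routing decision the hybrid model above implies. Every threshold, field and name here is hypothetical; a real runtime would also weigh latency, battery, cost and the governance issues mentioned in the paragraph:

```python
from dataclasses import dataclass

@dataclass
class GenAIRequest:
    model_params: float      # size of the model the task needs
    private_data: bool       # must the inputs stay on the device?
    needs_governance: bool   # e.g. copyright / policy checks done server-side

DEVICE_PARAM_BUDGET = 10e9   # assumed on-device ceiling (see discussion above)

def route(req: GenAIRequest) -> str:
    """Return 'device', 'hybrid' or 'cloud' for a request (illustrative only)."""
    if req.model_params <= DEVICE_PARAM_BUDGET and req.private_data:
        return "device"                          # keep sensitive data local
    if req.model_params > DEVICE_PARAM_BUDGET or req.needs_governance:
        # small enough to split across device and cloud, otherwise all cloud
        return "hybrid" if req.model_params <= 5 * DEVICE_PARAM_BUDGET else "cloud"
    return "device"

print(route(GenAIRequest(7e9, private_data=True, needs_governance=False)))   # device
print(route(GenAIRequest(70e9, private_data=False, needs_governance=True)))  # cloud
```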


The Impact of On-Device GenAI on Forecasted TCO


According to the Tirias Research GenAI Forecast and TCO Model, if 20% of GenAI processing workload could be offloaded from data centers by 2028 using on-device and hybrid processing, then the cost of data center infrastructure and operating cost for GenAI processing would decline by $15 billion. This also reduces the overall data center power requirements for GenAI applications by 800 megawatts. When factoring in the efficiencies of various forms of power generation, this results in a savings of approximately 2.4 million metric tons of coal, the reduction of 93 GE Haliade 14MW wind turbines, or the elimination of several million solar panels plus the associated power storage capacity. Moving these models to devices or hybrid also reduces latency while increasing data privacy and security for a better user experience, factors that have been promoted for many consumer applications, not just AI.
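For what it's worth, the headline figures are roughly self-consistent if you assume (my assumption, not stated in the article) that data-center cost scales linearly with the share of GenAI workload it carries:

```python
total_2028_cost_usd_b = 76    # Tirias estimate quoted earlier ($76B by 2028)
offload_fraction = 0.20       # share of workload moved on-device or hybrid

savings_usd_b = total_2028_cost_usd_b * offload_fraction
print(f"~${savings_usd_b:.0f}B saved")   # ~$15B, matching the quoted figure
```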


While many are concerned about the rapid pace of GenAI and its impact on society, there are tremendous benefits, but the high-tech industry now finds itself in catchup mode to meet the astronomical demands of GenAI as the technology proliferates. This is similar to the introduction and growth of the internet, but on a much larger scale. Tirias Research believes that the limited forms of GenAI in use today, such as text-to-text, text-to-speech and text-to-image, will rapidly advance to video, games, and even metaverse generation starting within the next 18 to 24 months and further straining cloud resources.


 
  • Like
  • Love
  • Fire
Reactions: 15 users

Foxdog

Regular


'but the high-tech industry now finds itself in catchup mode to meet the astronomical demands of GenAI as the technology proliferates'.

In this current environment it's difficult to imagine that AKIDA won't be in very high demand soon. I'm expecting big news from BRN prior to the end of this year. Surely some IP announcements are due soon and/or rapidly increasing revenue shown in the next 4C.
 
  • Like
  • Love
  • Fire
Reactions: 24 users

Getupthere

Regular
Interest rates will start dropping in the first half of next year and technology companies on the Nasdaq will start to boom again.

Big tech companies will be ready, and I'm tipping the last quarter of 2023 is going to be very interesting.

All in my opinion
 
  • Like
  • Love
  • Fire
Reactions: 24 users

HopalongPetrovski

I'm Spartacus!



PVDM was talking about the guts of this article years ago, and whilst at the time I thought it to be vaguely interesting and important, it has proven to be prophetic.
Given we have apparently already transitioned to an 'era of global boiling' I hope they can achieve the accelerated implementation before all our brains are cooked.

This...............

"While many are concerned about the rapid pace of GenAI and its impact on society, there are tremendous benefits, but the high-tech industry now finds itself in catchup mode to meet the astronomical demands of GenAI as the technology proliferates. This is similar to the introduction and growth of the internet, but on a much larger scale. Tirias Research believes that the limited forms of GenAI in use today, such as text-to-text, text-to-speech and text-to-image, will rapidly advance to video, games, and even metaverse generation starting within the next 18 to 24 months and further straining cloud resources."

The scale being talked about here and the upcoming rapidity of its take-up is staggering.
We are fortunate indeed to have had the foresight to invest in a company that is acting in this very area and has a solution ready and available to answer this need.
 
  • Like
  • Love
  • Fire
Reactions: 60 users

Teach22

Regular
Hi Bravo,

From my research into Qualcomm, I think that they have developed their own AI in-house, including the ViT.

I don't think that there is any relationship with BRN.
Doesn't matter how many times you (and others) say it. When someone doesn’t want to hear it, you will never change their mind.
 
  • Like
Reactions: 3 users

Bravo

If ARM was an arm, BRN would be its biceps💪!
Doesn't matter how many times you (and others) say it. When someone doesn’t want to hear it, you will never change their mind.
Sorry to contradict you, but I'm actually quite capable of changing my mind when presented with credible information such as the likes of which Dodgy-Knees generously provides.
 
  • Like
  • Love
  • Fire
Reactions: 41 users

Pmel

Regular
Hi Bravo,

From my research into Qualcomm, I think that they have developed their own AI in-house, including the ViT.

I don't think that there is any relationship with BRN.
Hopefully it's not bad for BrainChip?
 