BRN Discussion Ongoing

Time is ticking, and fast, and like probably many other shareholders I'm waiting for one decent contract in the next 2-3 months for Sean to save his job. If not, he should be

View attachment 76948
The recent upgrade to the BRN website to include a detailed list of where Pico will be going and into what products tells me they are about to sign a deal very soon.
Go brainchip
 
  • Like
  • Fire
  • Wow
Reactions: 13 users

Tothemoon24

Top 20
The good Doctor is an absolute ⭐


IMG_0583.jpeg



DeepSeek R1's breakthrough results are a boon for BrainChip and the extreme-edge AI industry. Here's how it will impact inference at the extreme edge:

DeepSeek has shattered the "No one ever got fired for going with IBM" mentality, liberating customers from the behemoths' models. This paves the way for a Cambrian explosion of specialized LLMs delivered by smaller companies. It's akin to breaking the 4-minute mile – now that it's been done, we're poised to witness an explosion of innovation. This is excellent news for smaller companies like BrainChip.

Now imagine AI models running on whispers of energy, delivering powerful performance without massive data centers or cloud connectivity. This is the promise of extreme-edge AI, which is BrainChip's sweet spot.

While DeepSeek is an impressive transformer-based model, TENNs technology, based on State Space Models (SSMs), currently offers a superior solution for extreme-edge applications:

1. No explosive KV cache: Unlike traditional transformer models that rely on rapidly expanding key-value (KV) caches, TENNs sidesteps this issue with fixed memory requirements, regardless of input length. This fundamental property enables LLMs at the extreme edge.

2. Competitive Performance: Preliminary experiments pitting the DeepSeek 1.5B model against the TENNs 1.2B model in various tasks, such as trip planning and simple programming, showed comparable or slightly better results for TENNs.

3. Extremely Low Training Costs: BrainChip's focus on small specialized models means training costs are less than a premium economy flight from Los Angeles to Sydney.

DeepSeek's success highlights areas where TENNs can be further enhanced. We can leverage many of the tricks learned from DeepSeek to improve the TENNs LLM even more.

The future of extreme-edge AI is bright, with DeepSeek demonstrating that small companies can compete effectively in this space.
 
Last edited:
  • Like
  • Fire
  • Love
Reactions: 119 users

IloveLamp

Top 20
View attachment 76949

(Quoted: the DeepSeek R1 / TENNs post reproduced in full above.)


That's what I thought I knew but wanted to hear. You're a keeper Tony, buy that man a beer.

Bring on 2025, the instos are coming for your shares. MAKE THEM PAY THROUGH THE NOSE.

imo dyor
 
  • Like
  • Fire
  • Love
Reactions: 58 users

Diogenese

Top 20
The good Doctor is an absolute ⭐


View attachment 76950


(Quoted: the DeepSeek R1 / TENNs post reproduced in full above.)
"equivalent to breaking the 4 minute mile" = Nvidia sliding down the Bannister, while Akida takes the express lift (= elevator in USese) to the top Landying.
 
Last edited:
  • Like
  • Fire
  • Haha
Reactions: 38 users

Diogenese

Top 20
The good Doctor is an absolute ⭐


View attachment 76950


(Quoted: the DeepSeek R1 / TENNs post reproduced in full above.)
... and I've still got 2 wishes left!
 
  • Like
  • Haha
  • Love
Reactions: 10 users
IMG_0208.jpeg


First port of call for Utae is to conduct internal audits on the whereabouts of the BrainChip revenue landing date ...!
 
  • Like
  • Fire
  • Haha
Reactions: 9 users

Diogenese

Top 20


The A.I. model wars are intensifying, with two new Chinese ones introduced in the last several hours.

One by Alibaba, no less...

Thanks DB,

I mistook the Alibaba patent for DeepSeek's because it had a "Wenfeng" as inventor.

CN118798303A: Large language model training method, question and answer method, equipment, medium and product (Patent Translate, 20240913)

ALIYUN FEITIAN HANGZHOU CLOUD COMPUTING TECH CO LTD

Inventors FENG WENFENG; ZHANG YUEWEI; ZENG ZHENYU

The invention provides a large language model training method, a question-and-answer method, equipment, a medium and a product in the technical field of artificial intelligence. The training method comprises the following steps: obtaining long-text training data whose sequence length is greater than the maximum input text sequence length of a pre-trained large language model; increasing the rotation-angle base of the model's rotary position encoding to obtain a modified pre-trained large language model; and training the modified model with the long-text training data to obtain a trained large language model. By acquiring long-text training data and increasing the base of the rotary position encoding, the usable input text sequence length is extended, so the trained model can process long text sequences, and the completeness and accuracy of its answers to questions that depend on long texts and multi-document comparison are improved.
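For readers wondering what "increasing the rotation-angle base of the rotary position encoding" does in practice: this reads like the well-known RoPE base (theta) scaling trick for context-length extension. A minimal sketch below; the head dimension and the two base values are illustrative assumptions of mine, not figures from the patent:

```python
# Hypothetical illustration of RoPE base scaling (not code from the patent or
# from any particular model). Each pair of channels is rotated by
# angle = position * base**(-2i / head_dim); raising the base slows the
# rotation, so longer sequences stay within the angular range seen in
# pre-training before the slowest channel completes a full turn.
import numpy as np

def rope_angles(position: int, head_dim: int = 64, base: float = 10_000.0) -> np.ndarray:
    """Rotation angles (radians) for each channel pair at a given position."""
    i = np.arange(head_dim // 2)
    inv_freq = base ** (-2.0 * i / head_dim)
    return position * inv_freq

for base in (10_000.0, 500_000.0):  # original base vs. an enlarged base (values assumed)
    slowest = rope_angles(1, base=base)[-1]   # angular step of the slowest-rotating pair
    wavelength = 2 * np.pi / slowest          # positions per full rotation of that pair
    print(f"base={base:>9,.0f}: slowest pair wraps after ~{wavelength:,.0f} positions")
```

With the larger base, the lowest-frequency channels take far longer to wrap around, which is roughly why continued training on long text after the base change lets the model handle much longer inputs.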
 
  • Like
  • Fire
Reactions: 6 users

Diogenese

Top 20
I agree Dio, but I think Sean reiterated again in that recent podcast interview he did that we are (still) an IP company. Someone correct me if my memory is not serving me well (I couldn’t be bothered listening to it again 😬)
Yes - he did say that, but there are several Akida 1 SoC products.

PCIe, M.2, Raspberry Pi, Edge Box, Edge Server, Bascom Hunter 3U VPX SNAP card, ...

I wonder where the chips are coming from?
 
  • Like
  • Thinking
  • Fire
Reactions: 12 users

rgupta

Regular
(Quoted: Diogenese's reply above.)
I wonder where the products are selling, though; there is no sales revenue either.
Jokes apart, I assume the Akida 1000 was fabricated based on demand-related orders, and there is the possibility the company can order more chips from TSMC.
 
  • Like
Reactions: 3 users

AusEire

Founding Member.
  • Haha
Reactions: 2 users

AusEire

Founding Member.
(Quoted: IloveLamp's reply above.)
Can we ensure it's not VB?
 
  • Haha
Reactions: 4 users

JDelekto

Regular
The good Doctor is an absolute ⭐


View attachment 76954


(Quoted: the DeepSeek R1 / TENNs post reproduced in full above.)
This has me excited:

1. No explosive KV cache: Unlike traditional transformer models that rely on rapidly expanding key-value (KV) caches, TENNs sidesteps this issue with fixed memory requirements, regardless of input length. This fundamental property enables LLMs at the extreme edge.

If I understand this correctly it seems to have a fixed memory requirement regardless of the context length. I notice that when using local inferencing with models that support up to a 128k context window, the smaller the context window size I choose, the less memory it consumes (in addition to the memory used by the model data itself).
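To make that concrete (and to illustrate point 1 of the quoted post), here is a rough back-of-envelope sketch. The layer/head counts, fp16 storage and the SSM state size are assumptions of mine for illustration, not published TENNs or DeepSeek figures:

```python
# Illustrative only: how a transformer KV cache grows with context length,
# versus the fixed recurrent state of an SSM-style model. All dimensions
# below are made up for the example.
LAYERS, HEADS, HEAD_DIM = 24, 16, 64   # assumed decoder shape
BYTES = 2                              # fp16

def kv_cache_bytes(context_tokens: int) -> int:
    # K and V tensors per layer, each of shape [heads, context_tokens, head_dim]
    return 2 * LAYERS * HEADS * HEAD_DIM * context_tokens * BYTES

def ssm_state_bytes(state_dim: int = 4096) -> int:
    # one fixed-size state per layer, independent of how many tokens were read
    return LAYERS * state_dim * BYTES

for ctx in (2_048, 32_768, 131_072):
    print(f"{ctx:>7} tokens: KV cache ≈ {kv_cache_bytes(ctx) / 1e6:9.1f} MB, "
          f"SSM state ≈ {ssm_state_bytes() / 1e6:.2f} MB")
```

The KV cache scales linearly with the context window, which is exactly the memory you see change when you shrink the window for local inference; the state-space model's working memory does not.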

Removing the context window altogether is like the holy grail for LLMs. You could feed it volumes of information to give it more and more context for a more accurate response to your query.

To put it into perspective, there are about 0.75 words per token, so a 128k-token context window holds around 96,000 words. Average novels run around 60,000 to 100,000 words, with "Harry Potter and the Sorcerer's Stone" at about 77,000 words.
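A quick sanity check on those numbers (the 0.75 words-per-token figure is a rough rule of thumb, not an exact constant):

```python
# Rough rule of thumb: ~0.75 words per token (about 1.33 tokens per word).
context_tokens = 128_000
words = int(context_tokens * 0.75)   # ≈ 96,000 words
novel_words = 77_000                 # approximate length of the first Harry Potter book
print(f"{context_tokens:,} tokens ≈ {words:,} words; "
      f"a {novel_words:,}-word novel fits with ~{words - novel_words:,} words to spare")
```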

Probably an oversimplification, but imagine stuffing a model up front with the Harry Potter novel and being able to ask it to summarize, look for specific parts of the text, create character biographies, write a new imaginative piece using the same writing style, isolate heroes vs. villains, etc.

Now consider an alternate route, like filling up the context with a programming language, examples, and the documentation for some 3rd-party APIs, and then asking the model to write code to solve a specific problem using all those tools.

If they can do extreme RAG (Retrieval-Augmented Generation, where current data is retrieved and used to give the model context) in memory-constrained edge cases, that is a boon for TENNs. It would be an additional advantage when performing accurate and up-to-date inferencing, in conjunction with Akida's ability to update its model as it learns from new sensor input.
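For anyone new to the RAG idea mentioned above, here is a toy sketch of the loop. The keyword-overlap scoring stands in for a real embedding search, and generate() is a hypothetical stub for whatever model actually runs on the device:

```python
# Toy retrieval-augmented generation loop: score documents against the query,
# put the best ones into the prompt, then ask the model. Everything here is
# a simplified stand-in, not BrainChip code.
from typing import List

documents = [
    "Akida supports on-chip learning from new sensor input.",
    "TENNs keep a fixed-size state regardless of input length.",
    "A 128k-token context window holds roughly 96,000 words.",
]

def score(query: str, doc: str) -> int:
    # crude relevance: number of shared lowercase words
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, docs: List[str], k: int = 2) -> List[str]:
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def generate(prompt: str) -> str:
    # placeholder for the on-device model's inference call
    return f"[answer produced from a {len(prompt)}-character prompt]"

query = "How much text fits in a 128k context window?"
prompt = "Context:\n" + "\n".join(retrieve(query, documents)) + f"\n\nQuestion: {query}"
print(generate(prompt))
```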
 
  • Like
  • Fire
  • Love
Reactions: 66 users

Schwale

Regular
(Quoted: JDelekto's post above.)
Thank you. This is an extremely informative post. It summarises how the process works and how components feed into one another.
 
  • Like
  • Love
Reactions: 13 users
(Quoted: JDelekto's post above.)
The only thing that would get me excited now is BrainChip actually selling their product and making some explosive revenue. NVIDIA would never have thought something like this could've happened, but it did. Imagine waking up one morning and finding out that the PRC has managed to develop and sell some type of device with on-chip learning, independent of the cloud. Don't think that because it's Chinese people won't buy it if it's at the right price... also no security concern, because it doesn't need to be connected to the cloud...
Tony needs to give Sean and the rest of the selling machine a good kick up the arse.
 
  • Like
  • Love
  • Fire
Reactions: 16 users

charles2

Regular

View attachment 76951
This rather eloquent viewpoint, or a version thereof, should be a news release from BrainChip touting not only our product but also our leadership.

Time to stop playing nice.
 
Last edited:
  • Like
  • Fire
  • Love
Reactions: 13 users

Rach2512

Regular
  • Like
  • Fire
  • Love
Reactions: 8 users

Hrdwk

Regular
This might sound stupid, but I wonder if they would consider doing to DeepSeek with Akida what DeepSeek did to ChatGPT?
 
  • Like
Reactions: 2 users

uiux

Regular
  • Like
  • Fire
  • Love
Reactions: 43 users
Haven't had much time this week & just skim reading posts where I can.

Just ran some ASX broker data on BRN to go with a couple of reports I saved the other week, to see who's doing what.

Not gone through it properly myself yet, but thought I'd post them here fwiw as I'm about to shut down the lappy.

One I hadn't seen before, in the first date range, was Instinet Aust (part of the Nomura Group, it appears), who bought $440k worth for someone(s).

Newest date range to oldest, covering from ~3 days ago (T+3 data) back to 1/12.

BRN 15.1.25 - 30.1.25.jpg

BRN 1.1.25 - 17.1.25.jpg


BRN 1.12.24 - 15.1.25.jpg
 
  • Like
  • Love
  • Fire
Reactions: 31 users