Hi Dr E,
I too share your confusion about Cerence.
However, there are some points which mitigate my concerns somewhat.
A few weeks ago, Mercedes started talking about MBUX not needing the "Hey Mercedes" wake-up phrase when there is only one person in the car, the corollary being that it is still needed when there are two or more people.
Another thing is that I've looked at Cerence's patents, and while they discuss the use of NNs, they do not describe or claim any NN circuitry.
As you say, Mercedes found Akida to be 5 to 10 times better than other systems for "Hey Mercedes". They also used "Hey Mercedes" as an example of what Akida could do, and appeared to make reference to plural uses of Akida.
On top of that, Mercedes also stated their desire to standardize on the chips they use. Akida is sensor agnostic.
Then there's the Valeo Scala 3 lidar due out shortly, which I think may contain Akida, leaving aside Luminar with their foveated lidar, who have stated that they expect to expand their cooperation with Mercedes from mid-decade. MB used Scala 2 to obtain Level 3 ADAS certification (sub-60 kph), while Scala 3 is rated to 160 kph.
Luminar, like Cerence, talks about using AI, but does not describe how it is constructed.
Standardizing on Akida would improve the efficiency of the MB design office, as their engineers would all be singing from the same hymn sheet in close harmony.
This patent application shows an acoustic classifier 152, a function which could be performed by Akida.
The combined classifier may then use the three classifier inputs for context discrimination, deciding whether the speaker is talking to the car or just in conversation.
US2022343906A1 FLEXIBLE-FORMAT VOICE COMMAND
[0042] As introduced above, the reasoner 150 processes both text output 115 and the audio signal 105. The audio signal 105 is processed by an acoustic classifier 152. In some implementations, this classifier is a machine learning classifier that is configured with data (i.e., from configuration data 160) that was trained on examples of system-directed and of non-system directed utterance by an offline training system 180. In some examples, the machine-learning component of the acoustic classifier 152 receives a fixed-length representation of the utterance (or at least the part of the utterance received to that point) and outputs a score (e.g., probability, log likelihood, etc.) that represents a confidence that the utterance is a command. For example, the machine-learning component can be a deep neural network. Note that such processing does not in general depend on any particular words in the input, and may instead be based on features such as duration, amplitude, or pitch variation (e.g., rising or falling pitch). In some implementations, the machine-learning component processes a sequence, for example, processing a sequence of signal processing features (e.g., corresponding to fixed-length frames) that represent time-local characteristics of the signal, such as amplitude, spectral, and/or pitch, and the machine-learning component processes the sequence to provide the output score. For example, the machine learning component can implement a convolutional or recurrent neural network.
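Just to make [0042] concrete, here's a rough Python sketch of the sort of acoustic classifier it describes: a small recurrent net over per-frame features (amplitude, pitch, etc.) that outputs a single confidence score that the utterance is a command. The model shape, feature count and all names are my own assumptions for illustration, not anything from Cerence's actual design.

```python
# Illustrative sketch only - a small GRU over per-frame acoustic features
# that scores "system-directed command" vs. "passenger conversation".
import torch
import torch.nn as nn

class AcousticCommandClassifier(nn.Module):
    def __init__(self, n_features: int = 8, hidden: int = 32):
        super().__init__()
        # GRU consumes a sequence of time-local frame features
        # (e.g., amplitude, spectral summary, pitch), per the patent text.
        self.rnn = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)   # single logit: command vs. conversation

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, n_frames, n_features)
        _, last_hidden = self.rnn(frames)
        logit = self.head(last_hidden[-1])
        return torch.sigmoid(logit)        # confidence the utterance is system-directed

# Example: score a 2-second utterance chopped into 100 frames of 8 features each
clf = AcousticCommandClassifier()
score = clf(torch.randn(1, 100, 8))
print(float(score))                        # arbitrary value for an untrained model
```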
[0051] In situations in which the reasoner 150 determines that an utterance is a system-directed command directed to a particular assistant, it sends a reasoner output 155 to one of the assistants 140A-Z with which the system 100 is configured. As an example, assistant 140A includes a natural language understanding (NLU) 120, whose output representing the meaning or intent of the command is passed to a command processor 130, which acts on the determined meaning or intent.
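And a toy illustration of the reasoner step in [0051], assuming (my guess, not stated in the patent) that acoustic, text and context scores are simply fused with weights and a threshold before the utterance is handed to an assistant. The weights, threshold and dispatch interface are all hypothetical.

```python
# Hypothetical fusion of three classifier scores into a routing decision.
def is_system_directed(acoustic: float, text: float, context: float,
                       weights=(0.4, 0.4, 0.2), threshold: float = 0.6) -> bool:
    fused = sum(w * s for w, s in zip(weights, (acoustic, text, context)))
    return fused >= threshold

def route_utterance(utterance_text: str, scores: dict, assistants: dict) -> str:
    """Hand the utterance to an assistant only when the fused confidence clears the threshold."""
    if is_system_directed(scores["acoustic"], scores["text"], scores["context"]):
        return assistants["default"](utterance_text)
    return "ignored: treated as passenger conversation"

# Example use with a trivial stand-in assistant
assistants = {"default": lambda text: f"assistant handling: {text!r}"}
print(route_utterance("set temperature to 21 degrees",
                      {"acoustic": 0.8, "text": 0.7, "context": 0.5},
                      assistants))
```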
[0052] Various technical approaches may be used in the NLU component, including deterministic or probabilistic parsing according to a grammar provided from the configuration data 160, or machine-learning based mapping of the text output 115 to a representation of meaning, for example, using neural networks configured to classify the text output and/or identify particular words as providing variable values (e.g., “slot” values) for identified commands. The NLU component 120 may provide an indication of a general class of commands (e.g., a “skill”) or a specific command (e.g., an “intent”), as well as values of variables associated with the command. The configuration of the assistant 140A may use configuration data that is determined using a training procedure and stored with other configuration data in the configuration data storage 160.
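For [0052], a minimal sketch of the kind of neural NLU it describes: one head classifying the overall intent ("skill"/"intent") and one tagging each token with a slot value. The intents, slots and dimensions are made up purely for illustration, not taken from the patent.

```python
# Illustrative intent + slot NLU - not the patent's actual implementation.
import torch
import torch.nn as nn

INTENTS = ["set_temperature", "navigate", "play_media"]      # hypothetical "skills"/"intents"
SLOTS = ["O", "temperature_value", "destination", "media_title"]  # hypothetical slot labels

class TinyNLU(nn.Module):
    def __init__(self, vocab_size: int = 1000, dim: int = 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.intent_head = nn.Linear(dim, len(INTENTS))   # whole-utterance intent class
        self.slot_head = nn.Linear(dim, len(SLOTS))       # per-token slot label

    def forward(self, token_ids: torch.Tensor):
        # token_ids: (batch, n_tokens)
        emb = self.embed(token_ids)
        intent_logits = self.intent_head(emb.mean(dim=1))  # pool over tokens for intent
        slot_logits = self.slot_head(emb)                  # one label per token for slots
        return intent_logits, slot_logits

nlu = TinyNLU()
intent_logits, slot_logits = nlu(torch.randint(0, 1000, (1, 6)))
print(intent_logits.argmax(-1), slot_logits.argmax(-1))    # untrained, so arbitrary labels
```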
I'm guessing that Akida 2 could be used in NLU 120.