A good recent article explaining various areas of the process and the market; worth a read.
It won't replace ML chips, but it could broaden the market.
semiengineering.com
Machine Learning Showing Up As Silicon IP
MARCH 3RD, 2022 - BY: BRYON MOYER
New machine-learning (ML) architectures continue to appear. Up to now, each new offering has been implemented in a chip for sale, to be placed alongside host processors, memory, and other chips on an accelerator board. But over time, more of this technology could be sold as IP that can be integrated into a system-on-chip (SoC).
That trend is evident at recent conferences, where an increasing number of announcements involve IP for which there may or may not be physical chips available.
“For customers who want to jump straight into machine learning and make their products smarter, it’s easier just to buy an AI chip with an existing tooling process and add that to your existing architectural platform,” said Danny Watson, principal engineer for the ICW business line at Infineon. “Going forward, when they’re going to make platform decisions across the whole portfolio, that’s when they’re going to have the architecture integrated into an SoC directly.”
This may accelerate the adoption of ML in dedicated applications, and it may be a faster route to market for ML hardware providers. While this may add options for system designers, IP comes with its own set of challenges.
Something completely new
While new technology appears every day in some form or another, most of what’s introduced is evolutionary. Typically, it involves a faster version of something, a new communications protocol, or a different way of storing data. It’s far less common that something completely new appears, adding a capability that was not possible before.
This is why ML’s impact has been so significant. While the ideas underlying the technology have been around for a while, it’s been only recently that silicon technology has made it possible for ML to be deployed at scales that previously were considered unfeasible. ML is a wholly new concept, not just a different way of doing something or an integration play. It has made possible solutions to problems that were not tractable before, and it has allowed system designers to conceive of equipment that would have been unthinkable until recently.
But as a new concept, there also is no best way to do it yet. The industry has been in the early stages of figuring out how to make it work, and there are lots of moving parts that can be tuned. So there are numerous proposals and offerings on different ways to solve problems.
It’s not a one-solution-fits-all situation either. What works best for one problem may be sub-optimal for another. The challenge is figuring out how much to specialize and how much to stay general-purpose.
The last time the chip industry was in this situation was decades ago, perhaps with the arrival of SSI logic or the microprocessor. Everything since then has been improvement and better integration. Decades later, these functions are widely available as IP.
The new “new kid in town”
Now that the technology necessary for ML is available, the industry seems to be charting a similar trajectory to logic, but on an extremely compressed timescale. The first big challenge is how to best implement ML capabilities. As with logic in the beginning, that has meant the availability of individual chips performing the ML functions. System designers can include them as accelerators either in parallel with their main CPUs (if the chip includes a host), or by using their CPU as a host to control the ML tasks.
These chips originally were designed onto boards or modules dedicated to ML. With an original focus on the cloud, an entire board with a PCIe interface might be dedicated to ML training or inference. But with ML being offered as IP instead of individual chips, ML functions can be wrapped into SoCs, which in turn reduces the overall footprint on a board.
But as the industry increasingly moves to chiplets and disaggregates large SoCs, this business model may work for some startups. “If you don’t have hardware out by now, then you may strategically say, ‘Well, I’m gonna sell it as IP,’” said Dana McCarty, vice president of sales and marketing for inference products at Flex Logix.
Fig. 1: New neural architectures have largely been implemented as their own chip for inclusion on a board (left). Now offerings are starting to include IP for inclusion on a chip (right). Source: Bryon Moyer/Semiconductor Engineering
Cloud vs. edge
The impetus for ML IP will partially depend on where the ML will be instantiated. The original ML focus was on the cloud or in other data centers, where accelerator boards have proven to be a good solution. That becomes even more the case when considering a disaggregated data center of the future, where resources can be pulled as necessary. In the cloud, it makes more sense for ML functionality to stand alone so that only as much as is needed can be roped into a particular project. If it were built into every server, then it might sit idle while the server worked on projects not needing ML.
Data center form factors thus provide a constraint. “Cloud and data centers have much more fixed infrastructure,” said Nick Ni, director of product marketing for AI and software at AMD. “You can’t just change the PCIe form factor. There’s a full rack of data-center servers that’s already been deployed, so you have to live within the constraints.”
As inference moves to the edge, that’s not necessary. “The edge is completely different,” said Ni. “There’s more flexibility there. But it’s also much less adopted today, because there’s so much randomness in the hardware. In automotive, drones, and medical applications, it’s like 2% adoption. The hardware market still has a huge untapped potential, and nobody’s a winner so far.”
For the edge, integration into an SoC could make sense. Unlike server-based applications, these tend to focus on specific problems, so the solution can be tailored. Edge devices tend to be small, so having fewer packages also makes sense. And an SoC designer can manage the power and performance of the ML IP block more directly as compared to what would be possible with a dedicated chip.
There are typically two types of ML chips — those intended for training, with facilities for back-propagation, and those intended only for inference. Training is likely to be a cloud-based activity for the foreseeable future, so it’s not obvious that an IP version of a training-oriented ML architecture would make sense.
Embedded ML functions are much more likely to be focused on inference. To the extent that such a device might want to improve its model over time, data would be sent back to the cloud for further training rather than attempting training in the edge device itself (with the exception of limited incremental learning). That could change if new training techniques arise, but for the current mainstream, IP will likely be limited to inference.
The impact of standards
The move to IP has often been spurred by standards that limit how much creativity can be brought to some functions. When implementing PCIe, for example, differentiation cannot rely on adding novel features because the features themselves are specified in the standards. So differentiation occurs based on how those features are realized. Speed is often captured in a standard, so just being faster often has to do with how much headroom is available. That leaves power and cost as the major silicon-based characteristics for differentiation.
But the other big opportunity lies in ease of use. Standards-based IP, in particular, provides a well-defined set of implementation options such as bus widths, security choices, or optional features. A company that makes it easy for designers to implement its IP will have less pressure on things like price when competing.
But it hasn’t been only standards that have benefited from IP. Infineon’s Watson points out that the audio market, for instance, has seen chip-based implementations gradually move into IP for greater integration, even though that doesn’t come as a result of standards circumscribing the available options.
Somewhere between standards-based and completely ad-hoc IP are IP blocks that have become de-facto standards, typically by virtue of market power. Arm processors, for example, aren’t industry standards for how to implement a processor, as the growing popularity of RISC-V shows, but they are prevalent enough to have set an expectation that a certain class of processor be available as IP.
Processors are at the heart of most SoCs, and so by definition an SoC must use IP for a processor rather than a dedicated processor chip. SoCs attempt to integrate as much as possible into a platform architecture that can be leveraged over enough designs to achieve the sales volume sufficient to pay back the enormous cost of developing the chip.
Like CPUs, ML processors in embedded systems are also logical candidates for inclusion in an SoC. Most of them are built out of pure logic, so there is no obvious technology barrier. As a result, it’s natural that architects would look to pulling them inside the SoC for better performance and lower power.
This makes ML IP different from standards-based IP. “If you take Bluetooth, for example, you’ve got specific companies providing that IP,” said Watson. “And it’s a handful, because the spec is defined, and there’s a bound to the innovation that you can provide. With machine learning, we don’t have that, and everybody thinks they can do it better.”
That’s not to say that standards will not eventually enter the ML arena. “This is an area where standards haven’t been driving the enablement,” Watson said. “This is something that’s hit the ground, and now standards are trying to catch up. Because it’s a disrupter, everybody wants to make sure that when they create IP and standards eventually do get defined, that they are the big player providing either the IP or the dedicated chip.”
SDKs complicate matters
While new hardware ideas for implementing ML are still being churned out at a high rate, the role of the software stack has become increasingly important, and some would say even more important than the hardware itself.
“Software is so complex that the hardware is a small piece of the overall value,” said McCarty.
AMD’s Ni agreed. “This space is all about software tools, and it’s very complex,” he said. “Folks tend to focus too much on the hardware to create IP that’s 10% more efficient. If you ask any AI customer, the biggest reason they’re using Nvidia today is software. Their software is mature and older. Software investments are very often underestimated by new entrants.”
But those investments also are increasingly required by companies buying chips. “Building a chip is never enough,” said Anoop Saha, senior manager for strategy and growth at Siemens EDA. “You need the software stack on top of it. It’s expensive, and you have to put in a lot of capital before you even see the chip and get revenue.”
As the number of ML chips increases, much of the competition rides on how easy an ML function is to implement for a given piece of hardware. The simpler it is, the more everyday designers can use it. The more one has to rely on specialist data scientists, the less reach the hardware will have. So while the software development kit (SDK) has become an important part of any ML offering, it’s easier to handle when offering a chip. The chip becomes its own standalone world, and the tools can operate independently of other parts of the system it goes into.
It’s not so cut-and-dried with ML as IP, though. A chip provider sells to the system builder, but an IP provider sells to a chip builder, which in turn sells to the system builder. “If you are an IP developer, you don’t know how your customer is going to use your IP,” said Saha. “And your customer probably does not know how it will be used in the market.”
That goes for the SDK, as well. The SoC builder won’t be using the SDK, but the system builder will. That means that the tools must pass through the chip designer to the system builder. Chip-level SDKs can assume a configuration of the silicon, but assuming ML IP is sold with options, an IP-oriented SDK must have an extra layer of flexibility to account for the different possible implementations.
In addition, the system builder will get those tools from the SoC vendor, not directly from the IP provider. That SoC will have its own SDK already, and so it’s only natural that the ML portions of the SDK be included with the rest of the SoC SDK. So the IP provider will need to ensure that its ML SDK has the proper hooks for integration into a larger set of tools. The SoC provider also will need to make the effort to wrap the IP tools into its own set to make it look as seamless as practical. That makes having an SDK that’s easy to integrate almost as important as having silicon IP that’s easy to integrate.
“Let’s create the wrappers so that it looks like it’s from us,” said Watson. “Under the hood, it’s utilizing an SDK from one of these IP providers, but that’s abstracted from the users.”
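The layering described above, an IP provider's SDK that adapts to configuration options and an SoC vendor's SDK that wraps it, can be sketched roughly as follows. This is only an illustration under stated assumptions: the class names (`MlIpConfig`, `IpVendorSdk`, `SocSdk`), the option fields, and the numeric values are all hypothetical and do not correspond to any real vendor's tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MlIpConfig:
    """Hypothetical options an SoC builder picks when instantiating the ML IP."""
    num_mac_units: int
    sram_kib: int


class IpVendorSdk:
    """Stand-in for the IP provider's SDK; it must adapt to each configuration."""

    def __init__(self, config: MlIpConfig) -> None:
        self.config = config

    def compile_model(self, model_name: str) -> dict:
        # A real SDK would emit a binary; this sketch only records the target options.
        return {
            "model": model_name,
            "mac_units": self.config.num_mac_units,
            "sram_kib": self.config.sram_kib,
        }


class SocSdk:
    """Stand-in for the SoC vendor's SDK, the only tool the system builder sees."""

    def __init__(self) -> None:
        # The SoC vendor knows how it configured the IP, so it fixes the options here.
        # The values 256 and 512 are arbitrary illustrative numbers.
        self._ml = IpVendorSdk(MlIpConfig(num_mac_units=256, sram_kib=512))

    def deploy_model(self, model_name: str) -> dict:
        # Delegate to the IP provider's SDK, then present the result as the SoC's own.
        artifact = self._ml.compile_model(model_name)
        artifact["toolchain"] = "soc-sdk"
        return artifact


result = SocSdk().deploy_model("keyword_spotter")
print(result)
```

In this sketch the system builder calls only `SocSdk.deploy_model`; the IP provider's SDK and its configuration stay hidden behind the wrapper, which is the abstraction Watson describes.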
There are further complications with ML IP. SoC verification means that debug must be thought through. “You have to be sure that if something goes wrong at the customer site, you are able to trace that error into your IP,” said Saha. “Your physical design will become more complex, and you might get an issue with design closure.”
This issue persists even after a chip has been deployed into a system. “Let’s say you see a problem in the field,” he added. “How do you take that error back to your IP? And once you have that error, how do you reproduce that error and fix it?”
Effort, value, and margin
Integrating ML functions can be particularly challenging. SoC builders likely will want to customize bits and pieces of the IP. A shrink-wrapped solution is less likely today simply because there is still so much new that’s appearing. So IP vendors may have a high-touch process that involves working with customers to alter the basic IP.
“If you’re building IP and you have to customize it for every customer, then it’s a losing value proposition and becomes very painful,” noted Saha. “It takes a lot of manual effort to customize it for specific use cases.”
The easier the IP vendor makes it to customize the IP, the less work will be required on each sale and the more scalable the business will be. “You have to build something that is easily customizable and easily split into different architectures, different performance and bandwidth levels so you don’t have to customize it a lot,” he said. “Most importantly, you should be able to build it across different technologies.”
On the plus side, IP can provide more market access at lower risk. “You are targeting different markets,” said Saha. “It’s an easier way to monetize things, it’s less complex, and there is less chance of things going wrong.”
But the issue of margin is more complicated. “The company building and producing a chip will always be able to capture more value purely because they are at the forefront and they can set the prices,” he continued. “For the IP company, it becomes more difficult unless you have something so differentiated that you can charge a premium.”
The size and profile of an ML-solution provider also may impact the chip vs. IP decision. A large company will be able to sell a chip to a broad range of customers, whereas a small company may get better traction by selling IP to a large company, thereby indirectly getting access to its better-established customer base.
“It’s better to try and provide IP into SoC silicon, because it has the biggest reach,” said Watson. “You’re not going to get the same margin, but is it better to get $2 of margin on 10 things, or 50 cents on 100,000 things?”
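Watson's rhetorical question can be checked with simple arithmetic. The sketch below uses only the figures from the quote; the helper name `total_margin` and the variable names are hypothetical, introduced purely for illustration.

```python
def total_margin(margin_per_unit: float, units: int) -> float:
    """Total margin is the per-unit margin multiplied by the number of units sold."""
    return margin_per_unit * units


# Figures taken directly from the quote above.
high_margin_low_volume = total_margin(2.00, 10)        # $2 of margin on 10 things
low_margin_high_volume = total_margin(0.50, 100_000)   # 50 cents on 100,000 things

print(f"$2.00 x 10      = ${high_margin_low_volume:,.2f}")
print(f"$0.50 x 100,000 = ${low_margin_high_volume:,.2f}")
```

With these numbers the low-margin, high-volume case yields $50,000 against $20, which is the point of the quote: reach can outweigh per-unit margin.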
Opportunities for both chips and IP
Some of the companies that are offering IP are also offering a chip – covering both sides of the opportunity. “I have seen multiple companies doing edge-inference where they were funded to design chips,” said Saha. “But then they decided to sell it as an IP for different SoC vendors.”
And some offerings will be IP only, although those companies will usually have built a test chip in order to verify their design. “Almost everybody who is designing IP has a test chip,” noted Saha.
The economics of chips and IP are different, of course. With new memories, for instance, DRAM and flash raise an almost impenetrable barrier against new entrants for dedicated memory chips. Embedded memory IP, on the other hand, can provide lots of value that DRAM and flash can’t provide.
In a similar manner, ML chips and IP need to be able to justify themselves economically. Of course, in this case, there’s not a highly optimized incumbent being threatened, so the new entrants will be competing with each other rather than a well-entrenched foe. Price-points have not yet been rationalized, and that process may get messy for both chips and IP.
So the industry definitely isn’t going into an all-IP mode. “I see both chips and IP,” said Sam Fuller, senior director of marketing at Flex Logix. “I see lots of both.”
But the appearance of IP as an option also is an indi