Can AI Continue To Scale?
Peter van der Made, Forbes Technology Council
Aug 2, 2023
Peter van der Made is the founder and CTO of BrainChip Ltd. BrainChip produces advanced AI processors in digital neuromorphic technologies.
Artificial intelligence is rapidly being deployed within all aspects of business and finance. Some exciting successes are putting pressure on the industry to embrace this new technology. No one wants to be left behind.
The core technologies behind AI are neural network models, deep learning algorithms and massive training data sets. A model is constructed for a specific purpose, such as object recognition, speech recognition or object tracking. The "model" describes how the neural network is structured: how many layers it has and how many parameters.
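As a rough illustration of what "parameters" means here, the parameter count of a simple stack of fully connected layers can be computed from the layer sizes alone. This is a minimal sketch; the layer sizes are made-up examples, not figures from the article:

```python
def dense_layer_params(n_in, n_out):
    # weights (n_in * n_out) plus one bias per output unit
    return n_in * n_out + n_out

def model_params(layer_sizes):
    # total parameters for a stack of fully connected layers
    return sum(dense_layer_params(a, b)
               for a, b in zip(layer_sizes, layer_sizes[1:]))

# a small hypothetical network: 784 inputs -> 128 hidden units -> 10 outputs
print(model_params([784, 128, 10]))  # 101770
```

Large language models arrive at their billions of parameters the same way, just with far wider layers and many more of them.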
The overall accuracy of the neural network is a function of the quality and size of the training data set, the number of parameters and the training procedure. This is not an exact science. Too much training, and the model fits the training set closely but fails to generalize to real-world situations; this is "overfitting" the model. Too little training, and the model underfits, failing even on situations it has already seen.
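A common guard against overfitting is early stopping: hold out a validation set and stop training once validation loss stops improving. The sketch below is illustrative (the function name and the loss curve are invented for the example, not taken from the article):

```python
def train_with_early_stopping(val_losses, patience=3):
    # val_losses: validation loss measured after each epoch on a held-out set.
    # Stop once the loss has failed to improve for `patience` epochs in a row.
    best, stale, stopped_at = float("inf"), 0, len(val_losses) - 1
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                stopped_at = epoch
                break
    return best, stopped_at

# loss falls, then rises as the model starts to overfit
losses = [1.0, 0.5, 0.4, 0.45, 0.5, 0.6, 0.7, 0.8]
print(train_with_early_stopping(losses))  # (0.4, 5)
```

Training stops at epoch 5, keeping the epoch-2 model that generalized best rather than the final, overfit one.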
No model is perfect. There is always a margin of error, and there are always outlier conditions the model was never trained to handle. Over the last 10 years, models have become more complex as capabilities and accuracy have increased.
Large language models such as Bard and GPT-4 use hundreds of billions of parameters and need massive data sets to train on. Even the most powerful personal computers cannot handle models that require this much computational power and memory. The computing is done via the internet (the cloud) on large data center computers: a server farm.
Server farms are used in applications such as natural language processing, generating text and images, classifying video streams, and IoT process control and monitoring. Wired estimates that training a large model like GPT-4 costs $100 million, using as many as 10,000 systems with powerful A100 GPU arrays over 11 months. The largest known model is Google GLaM, with more than 1 trillion parameters. Models are getting larger and larger, but can these systems continue to scale?
According to SemiAnalysis chief analyst Dylan Patel (via Insider), the cost of running ChatGPT is estimated to be as high as $700,000 daily. This cost breaks down into maintenance, depreciation on the computing hardware, and electricity for the servers and cooling systems. A study published jointly by Google and UC Berkeley (via Scientific American) estimated that training GPT-3 consumed 1,287 megawatt-hours of energy.
This is of great concern when multiplied by the number of server farms worldwide and the increase in AI processing. The power consumption of server farms will likely increase as more people start to access online AI. Server farms could consume more than 20% of the world's electricity by 2025.
Server farms use large racks with powerful computers and GPUs. They contain thousands of processing cores that can be used as parallel processing units to compute the function of a neural network. A single GPU can draw as much as 400 watts, and a server may use up to 32 of those GPUs. A company's cluster of large data centers may deploy as many as 2.5 million servers. Even if only half of the servers contain GPUs, a worst-case calculation reaches 16,000 megawatts of continuous draw, or 16,000 megawatt-hours every hour. That is a lot of greenhouse gases.
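The worst-case figure follows from simple multiplication. The assumptions (half the servers carry GPUs, 32 GPUs per server at 400 watts each) are the article's, and the result is instantaneous draw in megawatts, equivalently megawatt-hours per hour:

```python
servers = 2_500_000        # servers in the cluster (article's figure)
gpu_share = 0.5            # assume only half the servers contain GPUs
gpus_per_server = 32       # up to 32 GPUs per server
watts_per_gpu = 400        # peak draw of a single GPU

total_watts = servers * gpu_share * gpus_per_server * watts_per_gpu
print(total_watts / 1e6)   # 16000.0 megawatts, i.e. 16,000 MWh every hour
```

Note this ignores CPUs, networking and cooling overhead, so the true worst case would be higher still.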
There are several ways to reduce the environmental impact of server farms. One part of the solution is more efficient hardware, together with the use of renewable energy. Another is to use hybrid solutions that perform much of the processing distributed at the edge in specialized, low-power but high-performance neuromorphic hardware. Neuromorphic processing takes inspiration from the energy-efficient methods of the brain.
The human brain contains approximately 86 billion neuron cells with an estimated 100 trillion connections (roughly 80 times the more than 1 trillion parameters of GLaM, the largest of the large language models). Each cell has a variable amount of electrochemical memory. The information stored in this biological memory can be considered equivalent to the parameters in a neural network model.
In contrast to artificial neural networks, the brain's model is dynamic: it creates new connections and more memory as we learn, and it prunes redundant connections when we sleep. The human brain, even though its network is larger than the largest AI model, runs on only about 20 watts of power, less than a light bulb. The brain's structure is vastly different from the neural network models used in today's AI systems, notwithstanding the successes we have seen over the last few years.
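To put the 20-watt figure in perspective, compare a year of the brain's consumption against the 1,287 MWh GPT-3 training estimate cited above. This is a rough back-of-envelope comparison, not a like-for-like benchmark, since training is a one-time cost:

```python
brain_watts = 20
brain_kwh_per_year = brain_watts * 365 * 24 / 1000   # 175.2 kWh per year
gpt3_training_mwh = 1287                             # training estimate cited earlier
ratio = gpt3_training_mwh * 1000 / brain_kwh_per_year
print(round(ratio))  # 7346: thousands of brain-years of energy for one training run
```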
Neuromorphic processing borrows from the efficient processing techniques of the brain by copying its behavior into digital circuits. While digital circuits may not be as power-efficient as analog circuits, their stability, interchangeability and speed outweigh analog's slight power advantage. An event-driven convolution shell makes the neuromorphic computing engine transparent to both the developer and the user.
Neuromorphic processors can run convolutional neural networks (CNNs), supporting image classification on ImageNet1000, real-time video classification, odor and taste recognition, vibration analysis, voice and speech recognition, and disease and anomaly detection. Their low power consumption makes these functions possible in portable and battery-powered devices.
It is possible to reduce the excessive power consumption of data centers by using distributed AI processing in fast neuromorphic computing devices, which reduces operating costs and increases the functionality and responsiveness of edge products. Neuromorphic processing can help compensate for AI’s expected negative environmental impact.