Sorry, you have to fill in an email address (any fake one works) to download it.
Prodigy addresses the processor performance plateau caused by slow wires with architectural innovations that minimize data transmission over the slow wires.
www.tachyum.com
Prodigy on the Leading Edge of AI Industry Trends
Prodigy includes a powerful AI subsystem whose performance and features enable processing of cutting-edge models with ever-increasing parameter counts.
Recently, pre-trained transformer-based language models such as BERT and GPT have shown great improvements on many natural language processing tasks. However, these models contain a very large number of parameters. The emergence of even larger and more accurate models such as GPT-3 and Megatron suggests a trend toward ever-larger pre-trained transformer models.
Prodigy addresses these continuing trends in AI, including the explosion in complexity demanded by larger NLP models and more accurate conversational AI, by providing matrix multiplication instructions that operate on low-precision data types, as well as sparse matrix multiplication instructions for compressed deep learning models.
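To make the sparse half of that concrete, here is a minimal NumPy sketch, not Prodigy's actual instructions: weights are pruned to a compressed values-plus-indices form, and the matrix-vector product then touches only the surviving weights, which is the work a hardware sparse-matmul instruction avoids. All function names here are illustrative.

```python
import numpy as np

def prune_to_sparse(w, keep_ratio=0.5):
    """Zero out the smallest-magnitude weights, keeping only the
    surviving values and their flat indices (a compressed form)."""
    flat = np.abs(w).ravel()
    threshold = np.sort(flat)[int(flat.size * (1 - keep_ratio))]
    mask = np.abs(w) >= threshold
    return w[mask], np.flatnonzero(mask), w.shape

def sparse_matvec(values, indices, shape, x):
    """Multiply the compressed matrix by a dense vector, doing work
    only for the nonzero weights."""
    rows, cols = np.unravel_index(indices, shape)
    y = np.zeros(shape[0], dtype=x.dtype)
    np.add.at(y, rows, values * x[cols])
    return y

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
x = rng.standard_normal(64).astype(np.float32)
values, indices, shape = prune_to_sparse(w)

# Sanity check against a dense product over the pruned matrix.
w_pruned = np.zeros_like(w)
w_pruned.ravel()[indices] = values
assert np.allclose(sparse_matvec(values, indices, shape, x),
                   w_pruned @ x, atol=1e-4)
```

At 50% sparsity this halves both the stored weights and the multiply-accumulate count, which is exactly the saving a compressed-model instruction exploits.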
Quantization is an effective technique for reducing the memory footprint of neural networks and their training and inference time. The idea behind quantization is to reduce the numerical precision of both the weights and the operations in the network.
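As a simplified illustration of the idea (uniform per-tensor int8 quantization, rather than the FP8/FP16 formats described next), here is a NumPy sketch of the quantize/dequantize round trip:

```python
import numpy as np

def quantize_int8(w):
    """Map FP32 weights onto 8-bit integers with one per-tensor scale."""
    scale = np.abs(w).max() / 127.0   # fit the tensor's range into int8
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate FP32 weights from the 8-bit codes."""
    return q.astype(np.float32) * scale

w = np.random.randn(1000).astype(np.float32)
q, scale = quantize_int8(w)
print("max abs error:", np.abs(dequantize(q, scale) - w).max())  # ~scale/2
```

The memory footprint drops 4x relative to FP32, at the cost of a bounded rounding error per weight.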
Our 8-bit floating point number (FP8) has a (sign, exponent, mantissa) format of (1, 5, 2) bits, where the format is chosen carefully to represent weights, activations, errors, and gradients. The 16-bit floating point number (FP16) has a (1, 6, 9) format and is used for GEMM accumulations. Both the FP8 and FP16 formats were selected after in-depth studies of the data distributions in networks, in particular the gradient statistics of ResNet-18, ResNet-101, and SqueezeNet on the CIFAR-100 and ImageNet datasets, focusing on balancing representation accuracy and dynamic range.
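To see what the (1, 5, 2) layout buys, the sketch below decodes every non-negative FP8 bit pattern. The exponent bias of 15 (the usual 2^(5-1)-1 convention) and the handling of subnormals and top-exponent values are assumptions on my part; the text above does not spell them out.

```python
def decode_fp8(bits):
    """Decode one (1, 5, 2) FP8 bit pattern, assuming bias 15 and no
    reserved inf/NaN encodings (an assumption, not a documented fact)."""
    sign = -1.0 if (bits >> 7) & 1 else 1.0
    exp = (bits >> 2) & 0x1F          # 5 exponent bits
    man = bits & 0x3                  # 2 mantissa bits
    if exp == 0:                      # subnormals extend the small end
        return sign * (man / 4.0) * 2.0 ** (1 - 15)
    return sign * (1.0 + man / 4.0) * 2.0 ** (exp - 15)

values = sorted(decode_fp8(b) for b in range(128))  # non-negative half
print("smallest positive:", values[1])   # 2**-16, about 1.5e-05
print("largest finite:", values[-1])     # 1.75 * 2**16 = 114688.0
```

With only two mantissa bits, adjacent values can be up to 25% apart, which is why the wider (1, 6, 9) FP16 format is reserved for GEMM accumulations, where rounding error would otherwise compound.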