Accelerators: Hardware Specialized for Deep Learning

This SuperDataScience episode, hosted by our Chief Data Scientist, Jon Krohn, with guest Ron Diamant, is dedicated to the hardware we use to train and run A.I. models (particularly LLMs), such as GPUs, TPUs and AWS’s Trainium and Inferentia chips.

Ron:
• Works at Amazon Web Services (AWS) where he is Chief Architect for their A.I. Accelerator chips, which are designed specifically for training (and making inferences with) deep learning models.
• Holds over 200 patents across a broad range of processing hardware, including security chips, compilers and, of course, A.I. accelerators.
• Has been at AWS for nearly nine years – since the acquisition of the Israeli hardware company Annapurna Labs, where he served as an engineer and project manager.
• Holds a Master’s in Electrical Engineering from Technion, the Israel Institute of Technology.

This episode is on the technical side but doesn’t assume any particular hardware expertise. It’s primarily targeted at people who train or deploy machine learning models but might be accessible to a broader range of listeners who are curious about how computer hardware works.

In the episode, Ron details:
• CPUs versus GPUs.
• GPUs versus specialized A.I. Accelerators such as Tensor Processing Units (TPUs) and his own Trainium and Inferentia chips.
• The “AI Flywheel” effect between ML applications and hardware innovations.
• The complex tradeoffs he has to consider when embarking upon a multi-year chip-design project.
• The various ways we can split up training and inference across our available devices once we reach Large Language Model-scale models with billions of parameters.
• How to get popular ML libraries like PyTorch and TensorFlow to interact optimally with A.I. accelerator chips.
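
As a rough illustration of what splitting work across devices can mean, here is a minimal sketch of two common strategies: data parallelism (each device gets a slice of the batch and a full copy of the weights) and tensor parallelism (each device holds only a slice of the weights). It is not taken from the episode. The "devices" are simulated with plain Python lists, and every function name here is hypothetical.

```python
# Illustrative only: "devices" are simulated as plain Python lists,
# and all function names are hypothetical, not from any real library.

def matmul(x, w):
    """Multiply x (n x k) by w (k x m); both are lists of rows."""
    return [
        [sum(row[i] * w[i][j] for i in range(len(w))) for j in range(len(w[0]))]
        for row in x
    ]

def data_parallel_matmul(x, w, num_devices):
    """Data parallelism: split the batch (rows of x); every device has all of w."""
    assert len(x) % num_devices == 0, "batch must divide evenly across devices"
    step = len(x) // num_devices
    chunks = [x[d * step:(d + 1) * step] for d in range(num_devices)]
    out = []
    for chunk in chunks:  # each iteration stands in for one device
        out.extend(matmul(chunk, w))
    return out

def tensor_parallel_matmul(x, w, num_devices):
    """Tensor parallelism: split w by columns; every device sees the full batch."""
    cols = len(w[0])
    assert cols % num_devices == 0, "columns must divide evenly across devices"
    step = cols // num_devices
    shards = [[row[d * step:(d + 1) * step] for row in w] for d in range(num_devices)]
    partials = [matmul(x, shard) for shard in shards]  # one partial output per device
    # Concatenate each device's output columns back into full rows.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]

x = [[1, 2], [3, 4], [5, 6], [7, 8]]   # batch of 4 inputs
w = [[1, 0, 2, 1], [0, 1, 1, 2]]       # one layer's weights

# Both strategies reproduce the single-device result.
assert data_parallel_matmul(x, w, 2) == matmul(x, w)
assert tensor_parallel_matmul(x, w, 2) == matmul(x, w)
```

On real hardware the per-device loops would run concurrently and the final concatenation would be a communication step between chips, which is where much of the engineering tradeoff lies.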

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

Getting Value From A.I.

In February 2023, our Chief Data Scientist, Jon Krohn, delivered this keynote on “Getting Value from A.I.” to open the second day of Hg Capital’s “Digital Forum” in London.

The Chinchilla Scaling Laws

The Chinchilla Scaling Laws describe how much training data is needed to train a Large Language Model (LLM) of a given size compute-optimally. For Five-Minute Friday, our Chief Data Scientist, Jon Krohn, covers this tokens-to-parameters ratio and the LLMs that have arisen from it.
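
To make the ratio concrete, here is a minimal sketch that assumes the commonly cited rule of thumb of roughly 20 training tokens per model parameter, plus the usual approximation of about 6 FLOPs per parameter per token for training compute. Both constants are approximations rather than exact values, and the function names are hypothetical.

```python
# Assumed rule of thumb: ~20 training tokens per parameter (approximate).
TOKENS_PER_PARAM = 20
# Assumed approximation: ~6 training FLOPs per parameter per token.
FLOPS_PER_PARAM_TOKEN = 6

def chinchilla_optimal_tokens(num_params, tokens_per_param=TOKENS_PER_PARAM):
    """Estimate the compute-optimal number of training tokens for a model size."""
    return num_params * tokens_per_param

def approx_training_flops(num_params, num_tokens):
    """Rough estimate of total training compute in FLOPs."""
    return FLOPS_PER_PARAM_TOKEN * num_params * num_tokens

params = 70_000_000_000                      # a 70-billion-parameter model
tokens = chinchilla_optimal_tokens(params)   # 1.4 trillion tokens
flops = approx_training_flops(params, tokens)
print(f"{tokens:.2e} tokens, {flops:.2e} FLOPs")
```

Under these assumptions, a 70-billion-parameter model calls for about 1.4 trillion training tokens.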

StableLM: Open-Source “ChatGPT”-Like LLMs You Can Fit on One GPU

The folks who open-sourced Stable Diffusion have now released “StableLM”, their first Language Models. Pre-trained on an unprecedented amount of data for single-GPU LLMs (1.5 trillion tokens!), these are small but mighty.
