Designing Machine Learning Systems with Chip Huyen


Mega-bestselling author of "Designing Machine Learning Systems", Chip Huyen, joined our Chief Data Scientist, Jon Krohn, to cover her top tips on, well, designing ML systems! …as well as her burgeoning real-time ML startup.

Chip:
• Is Co-Founder of Claypot AI, a platform for real-time machine learning.
• Authored the book “Designing Machine Learning Systems”, which was published by O’Reilly Media and based on the Stanford University course she created and taught on the same topic.
• Also created and taught Stanford’s “TensorFlow for Deep Learning” course.
• Previously worked as an ML Engineer at the data-centric development platform Snorkel AI and as a Senior Deep Learning Engineer at the chip giant NVIDIA.
• Runs an MLOps community on Discord with over 14k members.
• Her helpful posts have earned her over 160k followers on LinkedIn.

This episode will probably appeal most to technical listeners like data scientists and ML engineers, but anyone involved in (or thinking of being involved in) the deployment of ML into real-life systems will learn a ton.

In this episode, Chip details:
• Her top tips for designing production-ready ML applications.
• Why iteration is key to successfully deploying ML models.
• What real-time ML is and the kinds of applications it’s critical for.
• Why Large Language Models (LLMs) like ChatGPT and other GPT-series architectures involve limited data science ingenuity but enormous ML engineering challenges.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.


Getting Value From A.I.

In February 2023, our Chief Data Scientist, Jon Krohn, delivered this keynote on “Getting Value from A.I.” to open the second day of Hg Capital’s “Digital Forum” in London.


The Chinchilla Scaling Laws

The Chinchilla Scaling Laws dictate the amount of training data needed to optimally train a Large Language Model (LLM) of a given size. For Five-Minute Friday, our Chief Data Scientist, Jon Krohn, covers this ratio and the LLMs that have arisen from it.
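As a back-of-the-envelope illustration of the ratio, here is a minimal sketch in Python. It assumes the commonly cited rule of thumb of roughly 20 training tokens per model parameter from the Chinchilla paper (Hoffmann et al., 2022); the paper itself derives the exact optimum from fitted scaling laws, so treat this constant as an approximation.

    # Rule-of-thumb from the Chinchilla paper (Hoffmann et al., 2022):
    # compute-optimal training uses ~20 tokens per model parameter.
    TOKENS_PER_PARAM = 20  # approximate ratio; the fitted optimum varies

    def chinchilla_optimal_tokens(n_params: float) -> float:
        """Estimate compute-optimal training tokens for a given model size."""
        return TOKENS_PER_PARAM * n_params

    # Chinchilla itself: 70B parameters -> ~1.4 trillion training tokens.
    print(f"{chinchilla_optimal_tokens(70e9):.1e} tokens")  # 1.4e+12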


StableLM: Open-Source “ChatGPT”-Like LLMs You Can Fit on One GPU

The folks who open-sourced Stable Diffusion have now released "StableLM", their first family of language models. Pre-trained on an unprecedented amount of data for single-GPU LLMs (1.5 trillion tokens!), these models are small but mighty.
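For readers who want to try one of these models on a single GPU, here is a minimal loading sketch using Hugging Face Transformers. The checkpoint ID shown is an assumption for illustration; check Stability AI's page on the Hugging Face Hub for the actual released StableLM model names, and note that device_map="auto" requires the accelerate package.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Hypothetical checkpoint ID; verify the released StableLM names
    # on the Hugging Face Hub before running.
    model_id = "stabilityai/stablelm-base-alpha-3b"

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # half precision to fit on one GPU
        device_map="auto",          # place layers on available devices
    )

    inputs = tokenizer("Real-time machine learning is", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))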
