Jon’s “Generative A.I. with LLMs” Hands-on Training

Data Science

This SuperDataScience episode, hosted by our Chief Data Scientist, Jon Krohn, introduces his two-hour “Generative A.I. with LLMs” training, which is packed with hands-on Python demos in Colab notebooks. It covers both open-source LLM options (Hugging Face; PyTorch Lightning) and commercial ones (the OpenAI API).

The training is available in its entirety on YouTube and all of the code is on GitHub. He has disabled all YouTube monetization on this video (indeed, on all of his instructional content on YouTube) so that you can enjoy it in full, commercial-free!

Topic Summary

Module 1: Introduction to Large Language Models (LLMs)
– A Brief History of Natural Language Processing (NLP)
– Transformers
– Subword Tokenization (see the sketch after this list)
– Autoregressive vs Autoencoding Models
– ELMo, BERT and T5
– The GPT (Generative Pre-trained Transformer) Family
– LLM Application Areas
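
The subword tokenization covered in this module can be tried in just a few lines with Hugging Face. Below is a minimal sketch; the “gpt2” checkpoint is an illustrative assumption, not necessarily the one used in the training:

```python
# Minimal subword-tokenization sketch using Hugging Face Transformers.
# Assumes the `transformers` package is installed; "gpt2" is used purely
# as an illustrative checkpoint.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "Generative A.I. with LLMs"
tokens = tokenizer.tokenize(text)  # subword pieces, e.g. ['Gener', 'ative', ...]
ids = tokenizer.encode(text)       # the integer IDs the model actually consumes

print(tokens)
print(ids)
```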

Module 2: The Breadth of LLM Capabilities
– LLM Playgrounds
– Staggering GPT-Family Progress
– Key Updates with GPT-4
– Calling OpenAI APIs, including GPT-4 (sketched below)
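
For the API-calling demo, a minimal sketch with the official `openai` Python package (v1+) might look like the following; it assumes an `OPENAI_API_KEY` environment variable is set and that your account has access to a GPT-4-class model:

```python
# Minimal sketch of a chat completion call with the `openai` package (v1+).
# Assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4",  # model availability depends on your account
    messages=[
        {"role": "user",
         "content": "Summarize what a Large Language Model is in one sentence."}
    ],
)

print(response.choices[0].message.content)
```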

Module 3: Training and Deploying LLMs
– Hardware Options (e.g., CPU, GPU, TPU, IPU, AWS chips)
– The Hugging Face Transformers Library
– Best Practices for Efficient LLM Training
– Parameter-Efficient Fine-Tuning (PEFT) with Low-Rank Adaptation (LoRA) (see the sketch after this list)
– Open-Source Pre-Trained LLMs
– LLM Training with PyTorch Lightning
– Multi-GPU Training
– LLM Deployment Considerations
– Monitoring LLMs in Production
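
For the PEFT/LoRA material, a minimal sketch with Hugging Face’s `transformers` and `peft` libraries is below; the checkpoint and target modules are illustrative assumptions rather than the exact configuration used in the training:

```python
# Minimal PEFT/LoRA sketch with Hugging Face `transformers` + `peft`.
# The "gpt2" checkpoint and "c_attn" target module are illustrative choices.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")

lora_config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor applied to the LoRA updates
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights train
```

Because only the small adapter matrices are trainable, LoRA fine-tuning fits in far less GPU memory than updating all of a model’s parameters.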

Module 4: Getting Commercial Value from LLMs
– Supporting ML with LLMs
– Tasks that can be Automated
– Tasks that can be Augmented
– Best Practices for Successful A.I. Teams and Projects
– What’s Next for A.I.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.


Getting Value From A.I.

In February 2023, our Chief Data Scientist, Jon Krohn, delivered this keynote on “Getting Value from A.I.” to open the second day of Hg Capital’s “Digital Forum” in London.

read full post

The Chinchilla Scaling Laws

The Chinchilla Scaling Laws prescribe how much training data is needed to compute-optimally train a Large Language Model (LLM) of a given size. For Five-Minute Friday, our Chief Data Scientist, Jon Krohn, covers this data-to-parameter ratio and the LLMs that have arisen from it.
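
As a rough rule of thumb, the Chinchilla recipe works out to about 20 training tokens per model parameter; the back-of-the-envelope sketch below recovers the paper’s roughly 1.4-trillion-token budget for a 70-billion-parameter model:

```python
# Back-of-the-envelope Chinchilla ratio: roughly 20 training tokens per
# model parameter is compute-optimal.
TOKENS_PER_PARAM = 20

def chinchilla_optimal_tokens(n_params: float) -> float:
    """Approximate compute-optimal training-set size, in tokens."""
    return TOKENS_PER_PARAM * n_params

print(f"{chinchilla_optimal_tokens(70e9) / 1e12:.1f} trillion tokens")  # ~1.4
```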

read full post

StableLM: Open-Source “ChatGPT”-Like LLMs You Can Fit on One GPU

The folks who open-sourced Stable Diffusion (Stability AI) have now released StableLM, their first family of language models. Pre-trained on an unprecedented amount of data for single-GPU LLMs (1.5 trillion tokens!), these models are small but mighty.
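
To see how such a model fits on one GPU, here is a minimal loading sketch; the checkpoint name and half-precision settings are assumptions (a 7B-parameter model in float16 needs roughly 14 GB of GPU memory), and `device_map="auto"` requires the `accelerate` package:

```python
# Minimal sketch of loading a StableLM checkpoint on a single GPU.
# The model name and float16 loading are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "stabilityai/stablelm-base-alpha-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # halves memory so the 7B model fits on one GPU
    device_map="auto",          # places weights on the available GPU
)

inputs = tokenizer("Large language models are", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```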

read full post