NLP with Transformers, feat. Hugging Face’s Lewis Tunstall

Lewis Tunstall — brilliant author of the bestseller “NLP with Transformers” and an ML Engineer at Hugging Face — details how to train and deploy your own LLMs, the race for an open-source ChatGPT, and why RLHF leads to better models.

Dr. Tunstall:
• Is an ML Engineer at Hugging Face, one of the most important companies in data science today because it provides much of the most critical infrastructure for A.I. through open-source projects such as its ubiquitous Transformers library, which has a staggering 100,000 stars on GitHub.
• Is a member of Hugging Face’s prestigious research team, where he is currently focused on bringing us closer to having an open-source equivalent of ChatGPT by building tools that support RLHF (reinforcement learning from human feedback) and large-scale model evaluation.
• Authored “Natural Language Processing with Transformers”, an exceptional bestselling book that was published by O’Reilly last year and covers how to train and deploy Large Language Models (LLMs) using open-source libraries.
• Prior to Hugging Face, was an academic at the University of Bern in Switzerland and held data science roles at several Swiss firms.
• Holds a PhD in theoretical and mathematical physics from the University of Adelaide in Australia.

This SuperDataScience episode, hosted by our Chief Data Scientist, Jon Krohn, is definitely on the technical side, so it will likely appeal most to data scientists and ML engineers. As usual, though, Jon made an effort to break down the technical concepts Lewis covered so that anyone who’s keen to be aware of the cutting edge in NLP can follow along.

In the episode, Lewis details:
• What transformers are.
• Why transformers have become the default model architecture in NLP in just a few years.
• How to train NLP models when you have little to no labeled data available.
• How to optimize LLMs for speed when deploying them into production.
• How you can optimally leverage the open-source Hugging Face ecosystem, including their Transformers library and their hub for ML models and data (see the short code sketch after this list).
• How RLHF aligns LLM outputs with what users want.
• How open-source efforts could soon meet or surpass the capabilities of commercial LLMs like ChatGPT.
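
To make the Hugging Face ecosystem point above concrete, here is a minimal sketch, assuming the open-source transformers library is installed, of pulling a pretrained checkpoint from the Hugging Face Hub and running inference with the high-level pipeline API; the specific model name is just an illustrative choice, and any compatible Hub checkpoint would work the same way:

```python
# Minimal sketch: load a pretrained checkpoint from the Hugging Face Hub
# and run inference via the high-level pipeline API.
from transformers import pipeline

# The model name here is only an illustrative choice of Hub checkpoint.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

result = classifier("Transformers have become the default architecture in NLP.")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```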

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

Getting Value From A.I.

In February 2023, our Chief Data Scientist, Jon Krohn, delivered this keynote on “Getting Value from A.I.” to open the second day of Hg Capital’s “Digital Forum” in London.

The Chinchilla Scaling Laws

The Chinchilla Scaling Laws dictate the amount of training data needed to optimally train a Large Language Model (LLM) of a given size. For Five-Minute Friday, our Chief Data Scientist, Jon Krohn, covers this ratio and the LLMs that have arisen from it.
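
As a rough, hedged illustration of that ratio, assuming the commonly cited Chinchilla rule of thumb of roughly 20 training tokens per model parameter (an approximation of the paper’s fitted scaling laws), a back-of-the-envelope calculation in Python might look like this:

```python
# Back-of-the-envelope sketch of the Chinchilla rule of thumb.
# The 20-tokens-per-parameter ratio is an approximation used purely
# for illustration; see the Chinchilla paper for the fitted laws.
TOKENS_PER_PARAM = 20

def chinchilla_optimal_tokens(n_params: float) -> float:
    """Approximate compute-optimal training tokens for a model with n_params parameters."""
    return TOKENS_PER_PARAM * n_params

for n_params in (7e9, 70e9):
    tokens = chinchilla_optimal_tokens(n_params)
    print(f"{n_params / 1e9:.0f}B parameters -> ~{tokens / 1e12:.2f}T training tokens")
```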

StableLM: Open-Source “ChatGPT”-Like LLMs You Can Fit on One GPU

The folks who open-sourced Stable Diffusion have now released “StableLM”, their first language models. Pre-trained on an unprecedented amount of data for single-GPU LLMs (1.5 trillion tokens!), these models are small but mighty.
