Observing LLMs in Production to Automatically Catch Issues

Data Science

In this episode of SuperDataScience, hosted by our Chief Data Scientist, Jon Krohn, Amber Roberts and Xander Song provide a technical deep dive into the major challenges (such as drift) that A.I. systems, particularly LLMs, face in production. They also detail solutions, such as open-source ML Observability tools.

Both Amber and Xander work at Arize AI, an ML observability platform that has raised over $60m in venture capital.

Amber:
• Serves as an ML Growth Lead at Arize, where she has also been an ML engineer.
• Prior to Arize, worked as an AI/ML product manager at Splunk and as the head of A.I. at Insight Data Science.
• Holds a Master's in Astrophysics from the Universidad de Chile in South America.

Xander:
• Serves as a developer advocate at Arize, specializing in their open-source projects.
• Prior to Arize, he spent three years as an ML engineer.
• Holds a Bachelor's in Mathematics from UC Santa Barbara as well as a BA in Philosophy from the University of California, Berkeley.

This episode will appeal primarily to technical folks like data scientists and ML engineers, but Amber and Xander made an effort to break down technical concepts so that the episode is accessible to anyone who'd like to understand the major issues A.I. systems can develop once they're in production, as well as how to overcome them.

In the episode, Amber and Xander detail:
• The kinds of drift that can adversely impact a production A.I. system, with a particular focus on the issues that can affect Large Language Models (LLMs); a minimal drift-detection sketch follows this list.
• What ML Observability is and how it builds upon ML Monitoring to automate the discovery and resolution of production A.I. issues.
• Open-source ML Observability options.
• How frequently production models should be retrained.
• How ML Observability relates to discovering model biases against particular demographic groups.
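
To ground the drift discussion, here's a minimal sketch of one widely used drift metric, the Population Stability Index (PSI), implemented in plain NumPy. The bin count and the 0.25 alert threshold are common illustrative conventions, not the defaults of Arize or any other particular observability tool:

```python
# Minimal sketch of drift detection with the Population Stability Index (PSI),
# one common metric behind ML monitoring and observability tools.
# The bin count and alert threshold here are illustrative conventions only.
import numpy as np

def psi(reference: np.ndarray, production: np.ndarray, bins: int = 10) -> float:
    """Score how far a production feature distribution has drifted from a reference."""
    edges = np.histogram_bin_edges(reference, bins=bins)
    edges[0], edges[-1] = -np.inf, np.inf  # catch out-of-range production values
    ref_counts, _ = np.histogram(reference, bins=edges)
    prod_counts, _ = np.histogram(production, bins=edges)
    eps = 1e-6  # avoid division by zero and log(0) in empty bins
    ref_pct = np.clip(ref_counts / ref_counts.sum(), eps, None)
    prod_pct = np.clip(prod_counts / prod_counts.sum(), eps, None)
    return float(np.sum((prod_pct - ref_pct) * np.log(prod_pct / ref_pct)))

rng = np.random.default_rng(42)
reference = rng.normal(0.0, 1.0, 10_000)   # feature distribution at training time
production = rng.normal(0.5, 1.2, 10_000)  # same feature, shifted in production
print(f"PSI = {psi(reference, production):.3f}")  # > 0.25 is commonly read as major drift
```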

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.


Getting Value From A.I.

In February 2023, our Chief Data Scientist, Jon Krohn, delivered this keynote on “Getting Value from A.I.” to open the second day of Hg Capital’s “Digital Forum” in London.


The Chinchilla Scaling Laws

The Chinchilla Scaling Laws dictate the amount of training data needed to optimally train a Large Language Model (LLM) of a given size. For Five-Minute Friday, our Chief Data Scientist, Jon Krohn, covers this ratio and the LLMs that have arisen from it.
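
As a back-of-the-envelope illustration: the Chinchilla paper (Hoffmann et al., 2022) found that compute-optimal training uses roughly 20 tokens per model parameter. The sketch below applies that approximate ratio; the 20:1 constant is a rounded summary of the paper's result, not an exact law:

```python
# Minimal sketch of the Chinchilla compute-optimal ratio:
# roughly 20 training tokens per model parameter (a rounded summary
# of the Hoffmann et al. 2022 result, not an exact law).

TOKENS_PER_PARAMETER = 20  # approximate Chinchilla-optimal ratio

def chinchilla_optimal_tokens(n_parameters: float) -> float:
    """Approximate compute-optimal number of training tokens for a given model size."""
    return TOKENS_PER_PARAMETER * n_parameters

for params in (70e9, 7e9, 1e9):  # 70B, 7B, and 1B parameter models
    tokens = chinchilla_optimal_tokens(params)
    print(f"{params / 1e9:>4.0f}B params -> ~{tokens / 1e12:.2f}T tokens")
# Chinchilla itself sits at this optimum: 70B parameters, ~1.4T training tokens.
```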


StableLM: Open-Source “ChatGPT”-Like LLMs You Can Fit on One GPU

The folks who open-sourced Stable Diffusion have now released “StableLM”, their first language models. Pre-trained on an unprecedented amount of data for single-GPU LLMs (1.5 trillion tokens!), these models are small but mighty.
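
For readers who want to try one of these models, here's a minimal sketch of loading and sampling from a StableLM checkpoint with the Hugging Face transformers library. The model ID shown is the 3B base alpha checkpoint as published on the Hugging Face Hub at release (an assumption worth verifying), and half precision is used so the model fits on a single GPU:

```python
# Minimal sketch: load a StableLM base model with Hugging Face transformers.
# Assumes the stabilityai/stablelm-base-alpha-3b checkpoint on the Hugging Face Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "stabilityai/stablelm-base-alpha-3b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # half precision so the model fits on a single GPU
    device_map="auto",          # requires the accelerate package
)

inputs = tokenizer("The Chinchilla scaling laws suggest", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```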
