How to Be Both Socially Impactful and Financially Successful in Your Data Career


Josh Wills, a real-life data science superhero (decarbonizer of the transport system, full-time modeler of the Covid pandemic, and force of nature at fast-growing tech firms), is the extraordinary guest of our Chief Data Scientist, Jon Krohn, in episode #665 of the SuperDataScience Podcast.

Josh has done a startling amount in his career:
• Worked to decarbonize transport as a software engineer at WeaveGrid.
• Modeled Covid-19 full-time for the Government of California in early 2020 as the pandemic was first kicking off.
• Was the first Director of Data Engineering at Slack.
• Was Director of Data Science at Cloudera.
• Was Staff Software Engineer at Google.
• Co-authored several editions of O’Reilly Media books on advanced analytics.
• Has given countless thought-provoking (and very funny!) talks at major data science conferences.
• And now describes himself as a “gainfully unemployed data person” as he contributes to open-source software projects and develops his “Data Engineering for Machine Learning” course.

This episode will appeal most to technical listeners who are keen to be outstanding data scientists or software engineers, particularly through engineering scalable ML products.

However, much of the content in the episode will appeal to anyone who’d like to hear from a brilliant, thoughtful, and seasoned professional who goes into depth on:
• The orders-of-magnitude more efficient “contextual bandit” approach to testing models in production.
• How to avoid the “infinite loop of sadness” in data product development.
• The pros and cons of choosing a management-track career path relative to an individual-contributor (IC) path.
• What it’s like to be called on as a life-saving data-science superhero during a catastrophic global event.
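As a loose illustration of the contextual-bandit idea mentioned above (not Josh's implementation; the variant names and reward values are invented for the example), here is a minimal epsilon-greedy sketch that routes traffic between two model variants and shifts toward whichever earns the better observed reward per context:

```python
import random
from collections import defaultdict

class EpsilonGreedyBandit:
    """Minimal epsilon-greedy contextual bandit for A/B-style model testing."""

    def __init__(self, arms, epsilon=0.1):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts = defaultdict(int)    # (context, arm) -> number of pulls
        self.values = defaultdict(float)  # (context, arm) -> running avg reward

    def choose(self, context):
        # Explore a random arm with probability epsilon...
        if random.random() < self.epsilon:
            return random.choice(self.arms)
        # ...otherwise exploit the best-known arm for this context.
        return max(self.arms, key=lambda arm: self.values[(context, arm)])

    def update(self, context, arm, reward):
        key = (context, arm)
        self.counts[key] += 1
        # Incremental mean update of the observed reward.
        self.values[key] += (reward - self.values[key]) / self.counts[key]

# Hypothetical usage: route a "mobile" request, then log its observed reward.
bandit = EpsilonGreedyBandit(["model_a", "model_b"])
arm = bandit.choose(context="mobile")
bandit.update("mobile", arm, reward=1.0)
```

Because the bandit reallocates traffic as rewards arrive, it can converge on the better variant with far fewer wasted impressions than a fixed 50/50 A/B split, which is the efficiency gain the episode digs into.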

The SuperDataScience podcast is available on all major podcasting platforms and on YouTube.


Getting Value From A.I.

In February 2023, our Chief Data Scientist, Jon Krohn, delivered this keynote on “Getting Value from A.I.” to open the second day of Hg Capital’s “Digital Forum” in London.


The Chinchilla Scaling Laws

The Chinchilla Scaling Laws dictate the amount of training data needed to optimally train a Large Language Model (LLM) of a given size. For Five-Minute Friday, our Chief Data Scientist, Jon Krohn, covers this ratio and the LLMs that have arisen from it.
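As a back-of-the-envelope illustration of that ratio (the Chinchilla paper estimated that roughly 20 training tokens per model parameter is compute-optimal; the helper function below is just a sketch of that rule of thumb):

```python
# Chinchilla rule of thumb: ~20 training tokens per model parameter.
CHINCHILLA_TOKENS_PER_PARAM = 20

def chinchilla_optimal_tokens(n_params: float) -> float:
    """Approximate compute-optimal training-set size (in tokens)
    for a model with n_params parameters."""
    return n_params * CHINCHILLA_TOKENS_PER_PARAM

# e.g., a 70-billion-parameter model would want ~1.4 trillion tokens:
print(f"{chinchilla_optimal_tokens(70e9):.2e}")  # prints 1.40e+12
```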


StableLM: Open-Source “ChatGPT”-Like LLMs You Can Fit on One GPU

Stability AI, the folks who open-sourced Stable Diffusion, have now released StableLM, their first suite of language models. Pre-trained on an unprecedented amount of data for single-GPU LLMs (1.5 trillion tokens!), these models are small but mighty.
