Data Science for Clean Energy, with Emily Pastewka


How can data science and machine learning power the transition toward a sustainable global economy? ML leader (and exceptional communicator of technical concepts!) Emily Pastewka joins our Chief Data Scientist, Jon Krohn, on the SuperDataScience podcast to fill us in on green data science.

Emily:
• Leads the data function at Palmetto, a cleantech startup focused on home electrification.
• Prior to Palmetto, spent more than 10 years building consumer data products and solving marketplace problems as a data science and ML leader at huge, fast-growing tech companies like Uber and Rent the Runway.
• Holds a Master's degree in Computer Science from Columbia University and undergraduate degrees in Economics and Environmental Policy from Duke.

This episode should be accessible to technical and non-technical folks alike because whenever Emily got technical, she did an exquisite job of explaining the concepts.

In this episode, Emily details:
• How data science and A.I. can make the world greener by shifting us to clean energy.
• The team of people needed to bring cleantech data solutions to life.
• How econometrics plays a key role in nudging consumers toward greener decisions.
• Her top tips for excelling as a data leader.
• What she looks for in the scientists and engineers she hires.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at


Getting Value From A.I.

In February 2023, our Chief Data Scientist, Jon Krohn, delivered this keynote on “Getting Value from A.I.” to open the second day of Hg Capital’s “Digital Forum” in London.


The Chinchilla Scaling Laws

The Chinchilla Scaling Laws describe how much training data is needed to optimally train a Large Language Model (LLM) of a given size. For Five-Minute Friday, our Chief Data Scientist, Jon Krohn, covers this data-to-model-size ratio and the LLMs that have arisen from it.
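As a rough illustration of the idea, the Chinchilla result is often summarized as a rule of thumb of about 20 training tokens per model parameter. The sketch below applies that approximation. The constant and the function name are assumptions made for illustration, not figures taken from this post.

```python
# Commonly quoted Chinchilla rule of thumb: roughly 20 training tokens per
# model parameter. This constant is an approximation assumed for illustration.
TOKENS_PER_PARAM = 20


def chinchilla_optimal_tokens(n_params: int, tokens_per_param: int = TOKENS_PER_PARAM) -> int:
    """Approximate compute-optimal number of training tokens for a model size."""
    return n_params * tokens_per_param


if __name__ == "__main__":
    # Hypothetical model sizes, chosen only to show how the estimate scales.
    for n_params in (1_000_000_000, 7_000_000_000, 70_000_000_000):
        tokens = chinchilla_optimal_tokens(n_params)
        print(f"{n_params / 1e9:.0f}B parameters -> ~{tokens / 1e12:.2f}T tokens")
```

Under this approximation, a 70-billion-parameter model calls for roughly 1.4 trillion training tokens.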


StableLM: Open-Source “ChatGPT”-Like LLMs You Can Fit on One GPU

The folks who open-sourced Stable Diffusion have now released “StableLM”, their first language models. Pre-trained on an unprecedented amount of data for single-GPU LLMs (1.5 trillion tokens!), these models are small but mighty.
