Big A.I. R&D Risks Reap Big Societal Rewards, with Meta’s Dr. Laurens van der Maaten


By making big research bets, the prolific Meta Senior Research Director Dr. Laurens van der Maaten has devised or supported countless world-changing machine-learning innovations across healthcare, climate change, privacy and more.

Laurens:
• Is a Senior Research Director at Meta, overseeing swathes of its high-risk, high-reward A.I. projects, with application areas as diverse as augmented reality, biological protein synthesis and tackling climate change.
• Developed the “CrypTen” privacy-preserving ML framework.
• Pioneered web-scale weakly supervised training of image-recognition models.
• Along with the iconic Geoff Hinton, created the t-SNE dimensionality-reduction technique (the t-SNE paper alone has been cited over 36,000 times).
• In aggregate, his works have been cited nearly 100,000 times!
• Holds a PhD in machine learning from Tilburg University in the Netherlands.

This SuperDataScience episode, hosted by our Chief Data Scientist, Jon Krohn, will probably appeal primarily to hands-on data science practitioners, but there are tons of content in it for anyone who’d like to appreciate the state of the art in A.I. across a broad range of socially impactful, super-cool applications.

In this episode, Laurens details:
• How he pioneered learning across billions of weakly labeled images to create a state-of-the-art machine-vision model.
• How A.I. can be applied to the synthesis of new biological proteins with implications for both medicine and agriculture.
• Specific ways A.I. is being used to tackle climate change as well as to simulate wearable materials for enhancing augmented-reality interactivity.
• CrypTen, a library that works just like PyTorch but where all the computations are encrypted.
• The wide range of applications of his ubiquitous dimensionality-reduction approach.
• His vision for the impact of A.I. on society in the coming decades.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at


Getting Value From A.I.

In February 2023, our Chief Data Scientist, Jon Krohn, delivered this keynote on “Getting Value from A.I.” to open the second day of Hg Capital’s “Digital Forum” in London.


The Chinchilla Scaling Laws

The Chinchilla Scaling Laws dictate the amount of training data needed to optimally train a Large Language Model (LLM) of a given size. For Five-Minute Friday, our Chief Data Scientist, Jon Krohn, covers this ratio of training tokens to model parameters and the LLMs that have arisen from it.
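
To make the idea of a data-to-model-size ratio concrete, here is a minimal Python sketch. It rests on two assumptions that are not stated in this post: the commonly quoted rule of thumb of roughly 20 training tokens per model parameter, and the usual approximation that training costs about 6 FLOPs per parameter per token. The function names are hypothetical and chosen only for illustration; treat the numbers as ballpark estimates rather than exact prescriptions.

```python
import math

# ASSUMPTION: ~20 training tokens per parameter is a widely quoted
# approximation of the Chinchilla result, not a figure from this post.
TOKENS_PER_PARAM = 20

# ASSUMPTION: training compute is roughly 6 FLOPs per parameter per token.
FLOPS_PER_PARAM_PER_TOKEN = 6


def optimal_tokens(n_params: float) -> float:
    """Approximate compute-optimal number of training tokens for a model size."""
    return TOKENS_PER_PARAM * n_params


def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute in FLOPs for a model and dataset size."""
    return FLOPS_PER_PARAM_PER_TOKEN * n_params * n_tokens


def optimal_params_for_budget(flops: float) -> float:
    """Approximate compute-optimal model size for a fixed FLOP budget.

    With C = 6 * N * D and D = 20 * N, we get C = 120 * N**2,
    so N = sqrt(C / 120).
    """
    return math.sqrt(flops / (FLOPS_PER_PARAM_PER_TOKEN * TOKENS_PER_PARAM))


if __name__ == "__main__":
    n_params = 70e9  # a 70-billion-parameter model
    n_tokens = optimal_tokens(n_params)
    budget = training_flops(n_params, n_tokens)
    print(f"tokens: {n_tokens:.3g}")
    print(f"training FLOPs: {budget:.3g}")
    print(f"model size recovered from budget: {optimal_params_for_budget(budget):.3g}")
```

Under these assumptions, a 70-billion-parameter model pairs with about 1.4 trillion training tokens, which matches the configuration reported for the Chinchilla model itself.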


StableLM: Open-Source “ChatGPT”-Like LLMs You Can Fit on One GPU

The folks who open-sourced Stable Diffusion have now released “StableLM”, their first language models. Pre-trained on an unprecedented amount of data for single-GPU LLMs (1.5 trillion tokens!), these are small but mighty.
