Jon Krohn on Last Week in AI Episode #130


The only podcast our Chief Data Scientist, Jon Krohn, listens to is “Last Week in A.I.”, and on episode #130 he had the pleasure of co-hosting the show! Every episode surveys the previous week’s biggest A.I. news, and the biggest story in this episode was (of course) LLaMA 2.

LLaMA 2 was just released by Meta. Here are the key details:
• Open-source
• Unlike the original LLaMA, can be used commercially (as long as you have fewer than 700 million monthly active users 😂)
• Comes in a range of sizes, from 7 billion parameters (fits on a single GPU) to 70 billion.
• Offers ChatGPT-level performance on a broad range of natural-language benchmarks and is generally now the leading open-source LLM (except on tasks involving code or math).
• Has double the context window (4k tokens) of the original LLaMA.
• Estimated $25m was invested in it, including extensive safety and alignment testing (probably more extensive than any other open-source LLM).
• Uses a two-stage RLHF (reinforcement learning from human feedback) approach that is key to its outstanding generative capacity.
• A new method called “Ghost Attention” allows it to perform especially well in “multi-turn” (ongoing back and forth) conversation.

Other big news from the past week that we covered includes:
• Elon Musk’s launch of his own A.I. company, xAI
• WormGPT is the LLM of choice for cybercriminals
• LongLLaMA can handle contexts of 256k tokens
• A.I. mapping the “connectome” of the fruit fly’s brain
• A.I. designing new proteins that could transform medicine

Watch here: 

Getting Value From A.I.

In February 2023, our Chief Data Scientist, Jon Krohn, delivered this keynote on “Getting Value from A.I.” to open the second day of Hg Capital’s “Digital Forum” in London.


The Chinchilla Scaling Laws

The Chinchilla Scaling Laws dictate the amount of training data needed to compute-optimally train a Large Language Model (LLM) of a given size — roughly 20 training tokens per model parameter. For Five-Minute Friday, our Chief Data Scientist, Jon Krohn, covers this ratio and the LLMs that have arisen from it.
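As a rough back-of-the-envelope sketch of that ratio (the ~20-tokens-per-parameter rule of thumb from the Chinchilla paper; the helper function name here is our own, not from any library):

```python
def chinchilla_optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Approximate the compute-optimal training-token count for an LLM
    with n_params parameters, using the ~20 tokens-per-parameter rule
    of thumb from the Chinchilla scaling-laws paper."""
    return n_params * tokens_per_param

# e.g., a 70-billion-parameter model (LLaMA 2's largest size)
# would want on the order of 1.4 trillion training tokens:
print(f"{chinchilla_optimal_tokens(70e9):.2e}")
```

This is only a first-order estimate: the paper fits the optimal ratio from training runs, and the exact coefficient varies with compute budget and data quality.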


StableLM: Open-Source “ChatGPT”-Like LLMs You Can Fit on One GPU

The folks who open-sourced Stable Diffusion have now released “StableLM”, their first suite of language models. Pre-trained on 1.5 trillion tokens (an unprecedented amount of data for LLMs that fit on a single GPU), these models are small but mighty.
