6 Comments
Nathan Lambert:

Is Jonathon’s hair still blue? What’s the status?

swyx & Alessio:

see the video

things have developed since

Nathan Lambert:

Sad

Meng Li:

The general-purpose large model DBRX used 3,072 H100 GPUs for training, while GPT-5 required about 50,000 H100s. Meta has stated that by the end of 2024 it expects to have computing power equivalent to 600,000 H100 GPUs. The training of Llama-3 involved 49,152 H100 GPUs.

The current demand for compute in foundation-model training is immense, and the amount of compute available directly influences the level of intelligence a model can reach.

In the compute supply market, stability and abundant resource availability are crucial for serving a larger number of customers.

swyx & Alessio:

> while GPT-5 required about 50,000 H100s

ooh, source?

Meng Li:

This figure comes from an estimate by Elon Musk.
