~hyperlinks

Aren't researchers trying out new methodologies on smaller models literally every day?

kilianbuhn

> Training AI models can be “bursty,” meaning that there can be sudden spikes in GPU usage followed by periods of lower activity when researchers analyze the results and decide what to do next. This leads to what researchers refer to as a lower utilization rate, meaning they aren’t getting the most bang for their GPU buck.

Ooo, as a researcher I definitely feel this too. 90% of the time, the bottleneck is me... the speed of my attention and decision-making. Only 10% of the time is the bottleneck the speed of compute.

tech

xAI’s GPU fleet is running at about 11% utilization

zuspotirko
