Question

If and when this graph is extended to 10^14 parameter models trained on 10^14 elapsed tokens of similar-quality data, will the 10^14 parameter learning curve have slowed down substantially?

Total Forecasters: 29
Community Prediction: 29%
