Everyone seems to be talking about ChatGPT these days thanks to Microsoft Bing, but given the nature of large language models (LLMs), a gamer would be forgiven for feeling a certain déjà vu.
You see, even though LLMs run on massive cloud servers, they rely on dedicated GPUs for the training they need to work. Typically, this means feeding an absolutely obscene amount of data through neural networks running on arrays of GPUs with sophisticated tensor cores, and not only does it require a lot of power, it also requires a lot of physical GPUs to do at scale.
This sounds a lot like cryptomining, but it also isn't. Cryptomining has nothing to do with machine learning, and unlike machine learning, cryptomining's only value is producing a digital token that some people believe is worth something and are therefore willing to spend real money on.
That belief fueled a crypto bubble that produced a GPU shortage from 2020 to 2022, as cryptominers bought up Nvidia's entire Ampere lineup and left gamers out in the cold. That bubble has since burst, and GPU stock has now stabilized.
But with the rise of ChatGPT, are we going to see a repeat of the last two years? It’s unlikely, but also not out of the question.
Your graphics card is not going to lead to big LLMs
While you might assume that the best graphics card you can buy is exactly the kind machine learning types want for their setups, you'd be wrong. Unless you're at a university researching machine learning algorithms, a consumer graphics card isn't enough to drive the kind of algorithm you need.
Most LLMs, along with the other generative AI models that produce images or music, really emphasize the first L: Large. ChatGPT has churned through an incredibly large amount of text, and a consumer GPU simply isn't as suited to that task as industrial-strength GPUs running on server-class infrastructure.
These are the GPUs that will be in high demand, and that's what has Nvidia so excited about ChatGPT: not that ChatGPT will help people, but that running it will demand huge numbers of Nvidia's server-grade GPUs, leaving Nvidia poised to cash in on the ChatGPT hype.
The next ChatGPT will run in the cloud, not on local hardware
Unless you’re Google or Microsoft, you don’t run your own LLM infrastructure. You use someone else’s in the form of cloud services. This means you’re not going to have a bunch of startups out there buying all the graphics cards to develop their own LLMs.
Most likely, we will see LLMaaS, or large language models as a service. Microsoft Azure and Amazon Web Services data centers will offer huge server farms full of GPUs ready to rent out for your machine learning workloads. This is the kind of thing startups love; they hate buying equipment that isn't a ping pong table or a bean bag chair.
This means that as ChatGPT and other AI models proliferate, they aren't going to run natively on consumer hardware, even when the people running them are a small team of developers. They'll run on server-grade hardware, so no one is coming for your graphics card.
Gamers are not out of the woods yet
So, nothing to worry about, then? Not quite…
The thing is, while your RTX 4090 might be safe, the question is how many RTX 5090s Nvidia will make when it only has a limited amount of silicon at its disposal, and using that silicon for server-grade GPUs can be far more profitable than using it for GeForce graphics cards.
If there's anything to fear from the rise of ChatGPT, it's the prospect of fewer consumer GPUs being made because shareholders demand that more server-grade GPUs be produced to maximize profits. This isn't an idle threat, either. As the rules of capitalism are currently written, companies are often required to do whatever maximizes shareholder returns, and the cloud will always be more profitable than selling graphics cards to gamers.
On the other hand, this is really an Nvidia issue. Team Green may go all-in on server GPUs with reduced consumer graphics card stock, but they’re not the only ones making graphics cards.
AMD's RDNA 3 graphics cards just introduced AI hardware, but it's nothing close to the tensor cores on Nvidia's cards, making Nvidia the de facto choice for machine learning. That means AMD may become the default card maker for gamers while Nvidia moves on to something else.
It's certainly possible, and unlike with crypto, AMD isn't likely to become the second-tier LLM card that's still good enough if you can't get an Nvidia card. AMD really isn't equipped for machine learning at all, especially not at the level LLMs require, so AMD isn't a factor here. That means there will always be consumer-grade graphics cards for gamers out there, and good ones too; there just might not be as many Nvidia cards as there used to be.
Team Green partisans may not like this future, but it is most likely given the rise of ChatGPT.