Editor’s Note: Four months after the launch of the chatbot ChatGPT, OpenAI unveiled its latest artificial intelligence technology, GPT-4, on Tuesday. Oren Etzioni, former CEO of the Allen Institute for AI, technical director of the AI2 Incubator, and professor emeritus at the University of Washington, offers his thoughts.
GPT-4 has arrived.
It is substantially more comprehensive and powerful than ChatGPT, which has already taken the world by storm. GPT-4 can diagnose patients, write software, play chess, write articles and more.
Last month, OpenAI CEO Sam Altman tweeted: “a new version of Moore’s Law that could kick in soon: the amount of intelligence in the universe doubles every 18 months.”
In the coming years, we will see GPT-4 and its ilk impact the information economy, jobs, education, politics, and even our understanding of what it means to be intelligent and creative. Referring to a GPT model as a “blurry JPEG of the web” understates both its current and its future capabilities.
However, it is important to point out that the technology has some limitations inherent in its ‘family DNA’.
GPT-4 has some superficial problems. First, it is limited by an extensive set of man-made “guardrails” that seek to prevent it from being offensive or off the wall. Second, it does not update its knowledge in real time. Third, its knowledge of languages other than English is limited. Fourth, it does not analyze audio or video. Fifth, it still makes arithmetic errors that a calculator would avoid.
However, none of these problems are inherent in the approach. To those who fixate on these, I would say, “don’t bite my finger, look where I’m pointing.” All of these issues will be overcome in GPT-5 or a subsequent release from OpenAI or a competitor.
More challenging is the fact that GPT-4 is still not reliable. Like ChatGPT, it “hallucinates,” inventing events and supporting those events with fabricated sources. Worse, it does so with the calm confidence of a habitual liar.
Like ChatGPT, it can be inconsistent in its responses when probed with multiple questions on the same topic. That’s because it doesn’t have a set of underlying beliefs and values — instead, it responds to human input based on a fuzzy combination of its training data and its internal, mathematically formulated goal.
For these reasons, it also exhibits pervasive biases; you’d be foolish to trust its answers without careful verification. Human developers who use a GPT-style tool called GitHub Copilot to generate software code snippets carefully review and test that code before incorporating it into their handwritten programs. However, each generation of the technology makes fewer mistakes, and we can expect this trend to continue.
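To make that review-and-test practice concrete, here is a hypothetical sketch in Python; the function, its inputs, and the checks are illustrative assumptions on my part, not output from any real Copilot session:

```python
# Hypothetical example of a snippet an assistant like GitHub Copilot might
# suggest for parsing a "YYYY-MM-DD" date string. The developer's job is to
# review the logic and test it before trusting it.
from datetime import date

def parse_iso_date(text: str) -> date:
    """Convert a 'YYYY-MM-DD' string into a date object."""
    year, month, day = (int(part) for part in text.split("-"))
    return date(year, month, day)

# Quick checks a developer would run before accepting the suggestion.
assert parse_iso_date("2023-03-14") == date(2023, 3, 14)

try:
    parse_iso_date("not-a-date")  # malformed input should fail loudly
except ValueError:
    pass
```

The point is not the snippet itself but the workflow around it: the generated code is treated as a draft to be verified, not as a finished answer.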
Because of this rapid progress and unprecedented success, it is important to point out that GPT-4 and the full range of similar AI technologies (sometimes called “foundation models” or “generative AI”) have fundamental limitations that will not be overcome in the foreseeable future. Unlike humans, GPT models do not have a body. Models rely on second-hand information in their input, which may be distorted or incomplete. Unlike humans, GPT models can simulate empathy but not feel it. While simulating empathy has its uses (think of a teenager who needs a shoulder to cry on at 2 a.m. in rural Kansas), it’s not the real thing.
While GPT models may seem infinitely creative and surprising in their answers, they cannot design complex artifacts. Perhaps the easiest way to see this is to ask: What elements of GPT-4 were designed by a generative model? The state of the art in artificial intelligence tells us that GPT-4 was built by scaling up and refining human-designed models and methods, including Google’s BERT and AI2’s ELMo. Stephen Wolfram provided an accessible overview of the technology here.
Regardless of the details, it’s clear that the technology is light years away from being able to design itself. Additionally, to design a chatbot, you need to start by formulating the goal, the underlying training data, the technical approach, specific sub-goals, and more. These are places where experimentation and iteration are required.
You also need to acquire the relevant resources, hire the right people and more. Of course, this was all done by the talented folks at OpenAI. As I argued in MIT Technology Review, successfully formulating and executing such efforts remains a distinctly human ability.
More importantly, GPT models are tools that work at our command. Although they are extremely powerful, they are not autonomous. They respond to our orders.
Consider the analogy with self-driving cars. In the coming years, self-driving cars will become more flexible and increasingly safe, but cars will not determine where we drive – that decision rests with humans. Likewise, it is up to us to decide how to use GPT models — to inform or to misinform.
The great Pablo Picasso said: “Computers are useless. They only give you answers.”
While GPT models are by no means useless, we still formulate the fundamental questions and evaluate the answers. That won’t change anytime soon.