The idea of an all-knowing computer program comes from science fiction and should stay there. Despite the seductive fluency of ChatGPT and other language models, they remain unsuitable as sources of knowledge. We must fight against the instinct to trust a human-sounding machine, argue Emily M. Bender & Chirag Shah.
Decades of science fiction have taught us that a key feature of a high-tech future is computer systems that give us instant access to seemingly limitless collections of knowledge through an interface that takes the form of a friendly (or sometimes sinisterly detached) voice. The early promise of the World Wide Web was that it might be the start of that collection of knowledge. With Meta’s Galactica, OpenAI’s ChatGPT and, earlier this year, Google’s LaMDA, it seems as though the friendly language interface is just around the corner, too.
However, we must not mistake a convenient plot device—a means to ensure that characters always have the information the writer needs them to have—for a roadmap to how technology could and should be created in the real world. In fact, large language models like Galactica, ChatGPT and LaMDA are not fit for purpose as information access systems, in two fundamental and independent ways.
First, what they are designed to do is to create coherent-seeming text. They do this by being cleverly built to take in vast quantities of training data and model the ways in which words co-occur across all of that text. The result is systems that can produce text that is very compelling when we as humans make sense of it. But the systems do not have any understanding of what they are producing, any communicative intent, any model of the world, or any ability to be accountable for the truth of what they are saying. This is why, in 2021, one of us (Bender) and her co-authors referred to them as stochastic parrots.