February 12, 2023

ChatGPT: The Promise, Hype & Concerns

Bappa Sinha

CHATGPT – the AI-powered chatbot – has taken the tech world by storm. Launched as a prototype and made available for public testing two months ago, on November 30, 2022, it has generated quite a buzz, gaining one million users in less than a week. People worldwide have been amazed and amused by its almost human-like responses on a wide range of topics.

It has produced poetry, Shakespeare-like prose, software code and medical prescriptions. Teachers and educators have expressed alarm over the use of ChatGPT by students to complete assignments. News articles have excitedly announced that it has passed law, medical and MBA exams (though passing an MBA exam can hardly be taken as a sign of intelligence). Abstracts written by ChatGPT for medical research journals have fooled scientists into believing humans wrote them. Tons of articles have appeared announcing the impending demise of a whole range of professionals, from journalists, writers, and content creators to lawyers, teachers, software programmers and doctors. Companies such as Google and Baidu have felt threatened by ChatGPT and rushed to announce their own AI-powered chatbots.

The core technology behind ChatGPT is an AI model called GPT-3, which stands for Generative Pre-trained Transformer, version 3. The “Generative” qualifier indicates that GPT belongs to a class of AI algorithms capable of generating new content such as text, images, audio, video and software code. “Transformer” refers to a new type of AI model first described in a 2017 paper by Google researchers.

Transformer models learn about the context of words by tracking statistical correlations in sequential data, such as the words in a sentence. They have caused a seismic shift in the field of AI and led to a series of significant advances. Stanford researchers, in a 2021 paper that dubbed these “foundation models”, wrote that the “sheer scale and scope of foundation models over the last few years have stretched our imagination of what is possible.”
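As a rough illustration of how such a model relates the words of a sequence to one another, here is a minimal Python sketch of the “self-attention” step at the heart of transformers, using made-up word vectors; real models add many layers of learned weights on top of this basic operation.

    import numpy as np

    def self_attention(X):
        """Scaled dot-product self-attention over a sequence of word vectors.
        Each output row is a weighted blend of all the rows of X, with weights
        reflecting how strongly the words correlate with one another."""
        d = X.shape[1]
        scores = X @ X.T / np.sqrt(d)                  # pairwise similarity between words
        weights = np.exp(scores)
        weights /= weights.sum(axis=1, keepdims=True)  # softmax: each row sums to 1
        return weights @ X                             # mix each word with its context

    # Three toy 4-dimensional "word vectors" standing in for a short sentence.
    sentence = np.random.rand(3, 4)
    print(self_attention(sentence))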

Currently, the most popular AI models use “neural networks”, a name that conjures images of an artificial brain simulated on computers. In reality, even with massive advances in computer hardware and chip density, we are nowhere close to simulating a human brain. Artificial neural networks are better thought of as a series of mathematical equations whose “weights”, or constants, are tweaked so that they effectively perform regressions on the data. In a way, AI models use “training data” to perform elaborate curve-fitting exercises. Once trained, the equations of the “curves” that fit the training data are used to predict or classify new data.
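The curve-fitting idea can be shown on a toy example. The Python sketch below invents some data that follows a simple straight line and then repeatedly nudges two “weights” until the model's equation fits that data; training a neural network does essentially this, only with billions of weights and far more complicated equations.

    import numpy as np

    # Toy "training data": y depends on x in a way the model does not know in advance.
    x = np.linspace(-1, 1, 50)
    y = 3 * x + 0.5 + np.random.normal(0, 0.05, size=x.shape)

    # The "model" is just the equation y = w*x + b, with two adjustable weights.
    w, b = 0.0, 0.0
    lr = 0.1
    for _ in range(500):
        pred = w * x + b
        error = pred - y
        # Nudge the weights in the direction that reduces the average squared error.
        w -= lr * 2 * np.mean(error * x)
        b -= lr * 2 * np.mean(error)

    print(w, b)  # ends up close to the 3 and 0.5 that generated the data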

Before transformers, AI models needed to be “trained” on datasets labelled by humans. For example, a vision AI model would be trained on millions of images, with humans manually labelling each one as showing a cat, a human, a mountain or a river. This is a very labour-intensive process, which limits the data on which the AI can be trained. Transformer models get around this limitation through what is called unsupervised or self-supervised training, i.e., they don't need labelled datasets. This way, they can be trained on the trillions of images and petabytes of text available on the internet. Such AI language models are also called “large language models” because of the sheer volume of data they are trained on. The models learn statistical correlations between words and parts of sentences that appear together.
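To see why no human labelling is needed, consider this Python sketch with a made-up toy sentence: the raw text itself supplies both the inputs and the “labels”, since each word serves as the target the model must learn to predict from the words before it.

    # Raw, unlabelled text is enough: each word acts as the "label" for the
    # words that precede it, so no human annotation is required.
    text = "the cat sat on the mat".split()

    training_pairs = []
    for i in range(1, len(text)):
        context, target = text[:i], text[i]
        training_pairs.append((context, target))

    for context, target in training_pairs:
        print(context, "->", target)
    # (['the'], 'cat'), (['the', 'cat'], 'sat'), ...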

GPT-3 is a transformer-based, generative, large language model. Once trained, it can predict the likely next word given a sequence of words. More precisely, it predicts a probability distribution over the most likely next words, and the next word is chosen randomly based on this distribution. The more data the model is trained on, the more coherent its outputs become, to the point where it produces not just grammatically correct passages but passages that sound meaningful. That by itself, however, does not produce factually correct or appropriate passages.
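The following Python sketch shows what this sampling step looks like; the probabilities here are invented for illustration, but they stand in for the kind of distribution a trained language model outputs for the next word.

    import random

    # Hypothetical distribution over the next word after "the cat sat on the ...",
    # of the kind a trained language model would produce.
    next_word_probs = {"mat": 0.55, "floor": 0.25, "sofa": 0.15, "moon": 0.05}

    def sample_next_word(probs):
        """Pick the next word at random, weighted by the model's probabilities."""
        words, weights = zip(*probs.items())
        return random.choices(words, weights=weights, k=1)[0]

    print(sample_next_word(next_word_probs))  # usually "mat", occasionally "moon"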

GPT-3.5 is a derivative of GPT-3 that uses supervised learning to fine-tune its outputs. Humans rate and correct outputs produced by GPT-3, and this feedback is incorporated back into the model. It has been reported that OpenAI outsourced the work of labelling tens of thousands of text snippets. In this way, the model is trained to produce outputs that do not contain obvious falsehoods or inappropriate content – at least as long as humans have rated or corrected the general category of topics the model is being asked about. ChatGPT is the adaptation of GPT-3.5 for chatting with humans.
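The actual fine-tuning procedure (known as reinforcement learning from human feedback) is considerably more involved, but the Python sketch below, with invented prompts and ratings, illustrates the basic idea of folding human judgements back into the training data.

    # Hypothetical human-rated model outputs: (prompt, model response, rating out of 5).
    rated_outputs = [
        ("What causes tides?", "The gravitational pull of the moon and sun.", 5),
        ("What causes tides?", "Tides are caused by whales swimming in sync.", 1),
    ]

    # Keep only responses the human raters approved of, and reuse them as extra
    # training examples so the model learns to prefer such answers.
    fine_tuning_data = [
        (prompt, response) for prompt, response, rating in rated_outputs if rating >= 4
    ]

    print(fine_tuning_data)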

Given the excitement generated by ChatGPT, it is clearly a significant advance over previous generations of AI-powered chatbots. ChatGPT is just one such transformer model to have been deployed. Google, for example, has built a model called BERT, which has been used to interpret user queries in Google Search since 2019. Such transformer-based models could, in the future, assist in content creation and curation, in search and as research aids, and in fields like software programming and even drug discovery through AI-based protein-folding simulations. It needs to be understood, however, that we are in the infancy of this new technological leap, and much more research is needed to bring some of these promises and possibilities to fruition.

OpenAI has proclaimed that it has a path to the holy grail of AI – Artificial General Intelligence (AGI), i.e., machines that develop human-like intelligence. Even though transformer models represent a significant advance in AI, we should be wary of such tall claims. Many people have reported ChatGPT giving incorrect responses, or responding with passages that superficially sound meaningful but are actually gibberish. One doctor reported an instance where, while making an impressive medical diagnosis, ChatGPT made an odd claim that seemed incorrect. When he asked for a reference, ChatGPT provided one from a reputed journal, with authors who had published in that journal. The only problem was that the reference simply didn't exist; ChatGPT had made it up out of thin air. This shows that ChatGPT is not just unreliable – it does not actually “understand” as we humans do. It simply generates content according to its statistical model, which is very good at fooling us into believing that it understands.

We should guard against the hype of a path to AGI. Our intelligence and instincts are the result of hundreds of millions of years of biological evolution and more than a hundred thousand years of human societal evolution. It is unlikely that even very powerful machines with sophisticated learning algorithms, consuming all the text, images and sounds produced by us, would develop understanding or intelligence in any way comparable to human intelligence. Machines can, however, learn to do specific tasks very well using the methods described above.

With this broad understanding, let us look into some of the issues with such models. To start with, the power of these models comes from the vast amounts of data they are trained on and the massive size of the underlying neural networks. GPT-3 has 175 billion parameters, up from 1.5 billion for GPT-2. Running such massive models requires huge amounts of hardware: OpenAI's training setup is estimated to have more than 10,000 GPUs and 285,000 CPU cores. Microsoft, which invested around USD 3 billion in OpenAI, boasted that it was one of the largest supercomputer clusters in the world, and much of that funding paid for the cost of using this setup. Microsoft is reportedly in talks to invest another USD 10 billion in the company. Such research is out of reach for most academic institutions and even most companies. Only the biggest digital monopolies in the world – the likes of Google, Microsoft, Amazon and Facebook – or companies funded by them can afford such a price tag, and they alone have access to the vast amounts of data from varied sources required for training these models. The development of such technologies, and the resulting economic benefits, would accrue only to these companies, further enhancing their already sprawling digital monopolies. Not just that: these companies are unlikely to share the data used to train the models or the rules used to filter that data. This raises ethical questions, since we know that these models can become biased depending on the choice of data – there are past examples of AI models developing racial or gender bias. It is unlikely that these companies and their technology teams have the capability or the inclination to deal with such issues.

Beyond the problem of training data selection, the models are currently opaque, in the sense that even the people working on them do not fully understand how they work, what the model parameters stand for, or what the parameter values correspond to in real life. These are giant statistical regression models built on humongous amounts of data, which are allowed to “figure” things out in an unsupervised manner. The models are also foundational, as the Stanford researchers called them, in the sense that many different adaptations and applications could be built on top of them for unrelated fields. So, even after being fully developed, while the models may work in a majority of cases, the ways they can cause fatal mistakes in various fields will not even be appreciated by these big companies, let alone be accounted for. And given that most academic institutions cannot afford such expensive hardware setups, and that the companies are unlikely to make these setups available for academic research, it will not be possible for third parties to test and review such models. Yet it is through such collaboration and peer review that science and technology have progressed so far. Finally, given the great harm social media algorithms have already unleashed on societies and democracies, with the creation of filter bubbles and the proliferation of fake news and hate speech, we shudder to think what horrors these new models will unleash on society.

Beyond these high-level concerns, the more immediate concerns stem from the extremely short-term outlook of these tech monopolies and the startups they fund, such as OpenAI.

While these transformer models represent a critical leap forward in AI technology, they are still in the research phase. Much more needs to be understood about how they work, and they need many generations of improvement before they can be deployed responsibly. However, we already see efforts to commercialise ChatGPT immediately. OpenAI has announced a USD 42-per-month professional plan for the ChatGPT API, which would open up the ChatGPT model for developers to launch their own commercial offerings. To allay educators' fears of students using ChatGPT to complete their assignments, OpenAI has already announced a service that would identify ChatGPT-produced writing. After creating the problem of plagiarism in the first place, OpenAI could make a killing by selling the solution to schools and universities worldwide!

Given the VC hype and greed, we are already seeing, or will soon see, half-baked solutions offering to make customer support operators, writers, artists, content creators, journalists, lawyers, educators and software programmers redundant in the near future. Such offerings will not only cause suffering, in the form of job losses, to people working in these fields but, given their premature deployment, will also do a great disservice to the professions themselves. Journalism, for example, has already been hollowed out worldwide, with news organisations indulging in deep cost-cutting to survive in the digital era; AI-generated news articles and opinion pieces may further exacerbate these trends. Moreover, with these models generating superficially reasonable content, and given the cost-cutting pressures many of these professions face, managers will be tempted to bypass human oversight of AI-generated content, resulting in serious mistakes.

There are also ethical concerns around AI-generated art, literature and movies. While the AI produces seemingly novel pieces of art, it in effect learns the style of other artists and can then reproduce that style without copying the content, thereby bypassing charges of plagiarism. In reality, such art amounts to high-tech plagiarism, since the models are incapable of creativity and merely generate content based on their learned prediction algorithms. How should society address such concerns?

To sum up, while transformer-based AI models represent a substantial advance in the field of machine learning, they are still in the research phase, and there are many unanswered questions about the ethics and regulation of their use. The unseemly haste and greed shown by their creators and funders have the potential to cause great damage to society. Governments worldwide must act quickly to prevent yet another familiar boom-bust tech cycle, with its associated harms to society. Given their foundational nature, governments should treat these technologies as public goods and set up public initiatives to fund their research and development, so that they can be safely and ethically developed and deployed for humanity's greater good.