Current neural models, such as GPT-3, address challenges that were out of reach only a few years ago: solving many complex textual inference tasks with a single neural learning method, namely transformers.

Such paradigms rely on the core notion of a language model: something that can be trained on simple prediction tasks (such as guessing a replaced word) but then encodes implicit linguistic knowledge about coherent sentences, convincing argumentation, as well as real-world facts. As a result, language models can serve as a basis for finer-grained tasks such as translation or question answering through task-specific training, generally called fine-tuning.
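To make this pre-training objective concrete, the following is a minimal sketch of word-replacement prediction, assuming the Hugging Face transformers library and the publicly available bert-base-uncased checkpoint; both are illustrative choices, since GPT-3 itself is trained on autoregressive next-word prediction rather than masked words:

```python
from transformers import pipeline

# A pre-trained masked language model: it was trained only to
# predict hidden words, yet its guesses reflect linguistic and
# real-world knowledge picked up from raw text.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

for candidate in unmasker("The capital of France is [MASK]."):
    print(candidate["token_str"], round(candidate["score"], 3))
```

The point of the sketch is that nothing task-specific is built in: the same pre-trained model, with fine-tuning, becomes the basis for translation, question answering, and other downstream tasks.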

The ability to apply unsupervised language-learning algorithms at a massive scale should not be confused with streaming analytics of other sorts. The basic idea is that while streams of numerical (e.g. sensor) data, such as those that physics traditionally targets, can be used to build accurate predictive models of specific phenomena, natural language is pervasive across different areas of human knowledge. Text-driven learning can be seen as a specific form of learning (e.g. transformer-based, as in GPT-3), but it is about universal phenomena. This process thus results in models that provide a general account of knowledge items, principles, and general rules of knowledge use, much closer to what we ask of general AI.

If the question is: Is GPT-3 already a representative example of a general form of AI, even without fine-tuning? then “No” is likely the answer. However, criticisms of the approach are usually too aggressive, as they tend to overemphasize specific weak aspects of the current models.

As usual, the truth lies somewhere in the middle. Notice that GPT-3-like models rely strongly on simple learning mechanisms but are NOT SIMPLE machines. Every such approach has a strong structure imposed by the pre-training vs. fine-tuning distinction, by attention-based constraints, and by the casting of individual tasks as textual inferences of some sort, able to properly trigger fine-tuning. It is thus quite true that machine learning, with all of its successful achievements, does not consist of magical boxes but always exhibits a strong representational (i.e. cognitive) basis. GPT-3 machines continue this tradition, and even if they are still on their way to general AI, we can look to them for inspiration toward increasingly better approximations of human intelligence.
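As an illustration of the attention-based constraints mentioned above, here is a minimal NumPy sketch of scaled dot-product attention, the core operation of transformer architectures; shapes and variable names are illustrative and not taken from any specific GPT-3 implementation:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core attention operation of transformer models.

    Q, K: arrays of shape (seq_len, d_k); V: shape (seq_len, d_v).
    Returns a weighted combination of the values V, where the
    weights reflect how strongly each query attends to each key.
    """
    d_k = Q.shape[-1]
    # Similarity of every query to every key, scaled for stability.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over the keys turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V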