Noam Chomsky contributed to a recent guest op-ed in the New York Times: here.

The gist of the article is that because the generative language models used by the recent batch of AI chatbots are based on machine learning, they can't and won't lead to artificial general intelligence.

To this one can compare the two goals of the original alchemists: to create or prolong life (the elixir of life), and to transmute base metals into gold.

Chemistry today has not only failed to deliver on either of these goals; the first has been consigned to biology (as living things are above the domain of chemistry) and the second to physics (as atomic nuclei are below the domain of chemistry).

Yet we don't hold these failures against present-day chemistry as a science.

Similarly, almost a century ago, AI set out to make computers "think" "better" than humans.

Along the way, we realized that while thinking is a human/animal trait, the informational/logical results of thinking can be achieved by computers via other means.

Computers beat us at chess, then they beat us at every other game, one by one, including Jeopardy. It wasn't by thinking, but it wasn't useless or meaningless either.


The second assertion is that generative language models and AI chatbots deal in probabilities and not the certainties we humans have in terms of real grammar and language skills, and that the rules of real languages can't be learned from merely analyzing data. In fact, in the concluding sentence the chatbots are ascribed "linguistic incompetence".

Those who have played with these chatbots can judge the linguistic (in)competence of these systems for themselves. Whether the systems talk nonsense or sense, truth or falsehood, one thing is certain: they do not talk ungrammatically.

Any second language speaker will be floored by the quality of the generated language. And native speakers will start to doubt themselves.

If anything AI chatbots are too good at grammar, and this is why they are so good at gaslighting us. We're not used to "writers" with no common sense writing with the linguistic skill of professional writers.


The final criticism of generative language models and AI chatbots is that these systems are amoral.

This is an interesting criticism. During the Vietnam war, Chomsky was faced with the criticism that MIT, the institution where he worked at the time, was developing weapons used in that war and for other nefarious ends by the US government and defence establishment. He maintained that it was enough not to be personally involved in developing such weapons (especially offensive weapons).

Chomsky at MIT: between the war scientists and the anti-war students

On other occasions Chomsky has described technology as basically neutral. It is like a hammer: it doesn't care whether you use it to build a house or whether a torturer uses it to crush someone's skull. The hammer can do either.

Noam Chomsky on Technology and Learning

Now one can ask: can AI chatbots serve a useful purpose? The answer is clearly yes. And they can serve a creative process as much as a hammer can.

If an author can write a book in a third of the time or less with the use of ChatGPT, that is progress. Whether this technology can replace humans as thinkers goes back to the failure of the alchemists above.

Similar remarks were probably made about steam power, locomotives, cars, power tools, etc. That they can do the nuts and bolts job faster, but they can't conceive the grand vision.

Yet no one questions the practical value of the myriad modern technologies of the last two centuries, or their contribution to progress, both material and moral. Though in all fairness there is a good deal of hand-wringing over the moral implications of all these wonderful technologies; in my view, too much.


To combine points two and three above, we can look at the work of Chomsky's thesis supervisor Zellig Harris.

Both Chomsky and Harris worked in part in the field of machine translation (in part for the defence establishment). But Harris's statement of his methods, both in his last book, "A Theory of Language and Information: A Mathematical Approach", and in an approachable series of lectures here: "Language and Information", describes an approach that is very well suited to the machine-learning implementation of generative language models and AI chatbots.

Unlike Chomsky, Harris believed that there were almost no universal rules of grammar, and that while each language has consistent rules, those rules can be independent of the rules of other languages. I assume that the similarities between languages are due either to the languages being related, or to the constraints imposed by vocalization, sound, hearing, etc.

The gist of Harris's theories is that language can be described as a mathematical system, where words are divided into sets based on how they combine with other words (think nouns, verbs, etc.), with each word having a probability distribution that is a function of where it stands in the sentence, and where improbable words carry more meaning.
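To make the last point concrete, here is a minimal sketch (my own illustration, not Harris's notation) using information-theoretic surprisal, -log2 p(word | context): the less probable a word is in its position, the more information, and arguably meaning, it carries. The probability values are invented for the example.

    import math

    # Toy conditional probabilities p(word | preceding word), invented for
    # illustration; in Harris's framework these would be the observed
    # likelihoods of words relative to the words they combine with.
    p_next = {
        ("sheep", "eat"):   0.30,
        ("sheep", "sleep"): 0.10,
        ("sheep", "vote"):  0.001,
    }

    def surprisal(prev, word):
        """Information carried by `word` after `prev`, in bits: -log2 p(word | prev)."""
        return -math.log2(p_next[(prev, word)])

    for prev, word in p_next:
        print(f"{prev} {word}: {surprisal(prev, word):.2f} bits")

    # The improbable continuation ("vote") carries far more information than
    # the expected one ("eat"), matching the idea that low-probability words
    # carry more meaning.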

In the simplest view there are four constraints/principles that govern languages: (1) dependence, i.e. which word classes can combine with which; (2) likelihood, the probability of each word relative to the words it combines with; (3) reduction, where material of very high probability can be shortened or omitted; and (4) linearization, the delivery of the sentence as a sequence of sounds in a fixed order. Each is described in turn below.

First, there are rules about how words from different sets can combine to generate all the possible base sentences in a language. For example, there is a class of verbs, like sleep or fall, that require one other word (a subject), and such subjects in general require zero other words. Other verbs, such as "to ascribe", require two other words: a subject and an object, which itself may require one or more arguments of its own. And so on.
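A minimal sketch of this operator-argument idea (a toy lexicon of my own, not Harris's notation): each operator class declares how many argument slots it has, and base sentences are generated by filling those slots.

    # Toy operator-argument lexicon.  Nouns ("zero-level" words) require no
    # further arguments; verbs are operators with one or two argument slots.
    nouns = ["the sheep", "the child"]
    one_arg_verbs = ["sleeps", "falls"]       # need one argument (a subject)
    two_arg_verbs = ["sees", "describes"]     # need a subject and an object

    def base_sentences():
        """Generate every base sentence licensed by the toy lexicon."""
        for verb in one_arg_verbs:
            for subj in nouns:
                yield f"{subj} {verb}"
        for verb in two_arg_verbs:
            for subj in nouns:
                for obj in nouns:
                    yield f"{subj} {verb} {obj}"

    for sentence in base_sentences():
        print(sentence)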

Second, that arrangement of words from different sets, plus the probability distribution of each specific word as a function of the preceding (and following) words, determines the meaning.

Third, there are rules that constrain the sentences based on how abstract words and sentences are translated into concrete ones with sounds, and where words or elements with very high probabilities can be omitted. For example: "the sheep eat grass and the sheep eat hay" becomes "the sheep eat grass and they eat hay", which becomes "the sheep eat grass and hay".

And finally, there is a linear ordering imposed on top of the partial ordering, since the concrete sentence is delivered in the form of sound, which must have a specific chronology: a beginning and an end, going through the words/elements in a specific order.
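As a toy illustration of these last two points (again my own encoding, not Harris's formalism): the operator-argument structure of a sentence is only partially ordered, and delivering it as speech forces one specific linear order on it.

    # An operator applied to its arguments: (operator, argument, ...).
    # The structure itself does not dictate a word order; linearization does.
    def linearize(node):
        """Flatten an operator-argument structure into subject-verb(-object) order."""
        if isinstance(node, str):
            return node
        op, *args = node
        if len(args) == 1:                    # one-argument operator
            return f"{linearize(args[0])} {op}"
        subj, obj = args                      # two-argument operator
        return f"{linearize(subj)} {op} {linearize(obj)}"

    print(linearize(("eat", "the sheep", "grass")))   # -> the sheep eat grass
    print(linearize(("sleep", "the children")))       # -> the children sleep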

This description of language can be parametrized in large part by using massive amounts of linguistic data and machine learning to infer the probabilities, the word categories, etc.
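As a minimal illustration of that parametrization, a toy stand-in for what the large models do at vastly greater scale: conditional word probabilities can be estimated simply by counting word pairs in a corpus.

    from collections import Counter

    # A tiny stand-in corpus; real systems use enormous amounts of text and
    # far richer models, but the principle of inferring probabilities from
    # data is the same.
    corpus = "the sheep eat grass the sheep eat hay the children sleep".split()

    pair_counts = Counter(zip(corpus, corpus[1:]))   # counts of (prev, word) pairs
    prev_counts = Counter(corpus[:-1])               # counts of each preceding word

    def p(word, prev):
        """Estimated probability of `word` given the preceding word `prev`."""
        return pair_counts[(prev, word)] / prev_counts[prev]

    print(p("eat", "sheep"))    # 1.0 -- "sheep" is always followed by "eat" here
    print(p("grass", "eat"))    # 0.5 -- "eat" is followed by "grass" half the time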

To quote from the overview of Harris's book above: "Given all the properties noted above, the constraints make it possible to devise a strategy for analyzing each sentence in a language, recognizing the reduction traces and the operators and arguments, from the latest entries back to the ultimate elements. Conversely, the constraints make it possible to synthesize every sentence. For, when the words of each class are specified, with their ranges of meaning or their likelihood in respect to their argument words, the syntactic theory presented here produces actual sentences of the language, with direct indication of their meaning."

When the op ed says that: "Unlike humans, for example, who are endowed with a universal grammar that limits the languages we can learn to those with a certain kind of almost mathematical elegance, these programs learn humanly possible and humanly impossible languages with equal facility.", we can see upon further study that Harris's methods have this mathematical elegance, without the limits of any "universal grammar".

And presumably any universal grammar unique to humans is a result of the parameters of constraints 3 and 4 above (phonemic reduction and linearization), which are due to the practical specifics of speech production and reception.
