Real patterns and the structure of language
There’s been a lot of hype recently about the emergence of technologies like ChatGPT and the effects they will have on science and society. Linguists have been especially curious about what highly successful large language models (LLMs) mean for their business. Are these models unearthing the hidden structure of language itself or just marking associations for predictive purposes?
In order to answer these sorts of questions we need to delve into the philosophy of what language is. For instance, if Language (with a big “L”) is an emergent human phenomenon arising from our communicative endeavours, i.e. a social entity, then AI is still some way off approaching it in a meaningful way. If Chomsky, and those who follow his work, are correct that language is a modular mental system innately given to human infants and activated by minuscule amounts of external stimulus, then AI is again unlikely to be linguistic, since most of our most impressive LLMs suck up so many resources (in terms of both data and energy) that they are far from this child-like learning target. On the third hand, if languages are just very large (possibly infinite) collections of sentences produced by applying discrete rules, then AI could be super-linguistic.
In my new book, I attempt to find a middle ground or intersection between these views. I start with an ontological picture (meaning a picture of what there is “out there”) advocated in the early nineties by the prominent philosopher and cognitive scientist Daniel Dennett. He draws on information theory to distinguish between noise and patterns. In noise, nothing is predictable, he says. But more often than not, we can and do find regularities in large data structures. These regularities provide us with the first steps towards pattern recognition. Another way to put this is that if you want to send a message and you need the entire series (string or bitmap) of information to do so, then it’s random. But if there’s some way to compress the information, it’s a pattern! What makes a pattern real is that it does not depend on an observer for its existence. Dennett uses this view to make a case for “mild realism” about the mind and the stance (which he calls the “intentional stance”) we use to identify minds in other humans, non-humans, and even artifacts. Basically, it’s like a theory we use to predict behaviour based on the success of our “minded” vocabulary comprising beliefs, desires, thoughts, etc. For Dennett, prediction matters theoretically!
If it’s not super clear yet, consider a barcode. At first blush, the black lines of varying width set against a white background might seem random. But the lines (and spaces) are set at regular intervals, revealing an underlying pattern that can be used to encode information (about the labelled entity/product). Barcodes are unique patterns, i.e. representations of data from which more information can be drawn (by the way, Nature produces these kinds of patterns too, in fractal formation).
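The compressibility test for patternhood can even be run on a computer. Here is a minimal sketch (my illustration, not from the book or from Dennett) using an off-the-shelf compressor as a rough stand-in for the information-theoretic idea: a regular string squeezes down to a fraction of its size, while random bytes barely compress at all.

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size: lower means more pattern."""
    return len(zlib.compress(data, 9)) / len(data)

patterned = b"ABAB" * 250   # a highly regular 1000-byte string
noise = os.urandom(1000)    # 1000 bytes of randomness

# The patterned string compresses dramatically; the noise does not.
print(compression_ratio(patterned))  # a small fraction of 1
print(compression_ratio(noise))      # close to (or above) 1
```

Real compressors only approximate the underlying notion (Kolmogorov complexity is uncomputable), but the contrast between the two ratios is the point: the regularity is there in the string whether or not anyone runs the compressor.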
“The methodological chasm between theoretical and computational linguistics can be surmounted.”
I adapt this idea in two ways in light of recent advances in computational linguistics and AI. The first reinterprets grammars, specifically the discrete grammars of theoretical linguistics, as compression algorithms. So, in essence, a language is like a real pattern, and our grammars are collections of rules that compress that pattern. In English, noticing that a sentence is made up of a noun phrase and a verb phrase is such a compression. More complex rules capture more complex patterns. The second treats discrete rules as a special case of continuous processes. In other words, at one level information theory looks very statistical while generative grammar looks very categorical, but the latter is a special case of the former. I show in the book how some of the foundational theorems of information theory can be translated into discrete grammar representations. So there’s no need to banish the kinds of (stochastic) processes often used and manipulated in computational linguistics, as many theoretical linguists have been wont to do in the past.
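The grammar-as-compression idea can be made concrete with a toy example (mine, not from the book). One rule, S → NP VP, plus two small lexicons stand in for every sentence they jointly license, so the grammar's description is much shorter than the list of sentences it generates, and the saving grows multiplicatively as the lexicons grow:

```python
import itertools

# A toy discrete grammar: S -> NP VP, with tiny hypothetical lexicons.
nouns = ["the cat", "the dog", "a linguist"]   # NP
verbs = ["sleeps", "runs", "writes"]           # VP

# One rule licenses every NP-VP combination...
sentences = [f"{np} {vp}" for np, vp in itertools.product(nouns, verbs)]

# ...so 6 lexical entries plus 1 rule compress 9 sentences;
# 100 nouns and 100 verbs would compress 10,000.
print(len(sentences))  # 9
```

This is of course a caricature of a real grammar, but it shows the sense in which noticing the NP–VP regularity is a compression of the pattern rather than a mere list of its instances.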
This just means that the methodological chasm between theoretical and computational linguistics, which has often served to close the lines of communication between the fields, can be surmounted. Ontologically speaking, languages are not collections of sentences, minimal mental structures, or social entities by themselves. They are informational states drawn from complex interactions of all of the above and more (like the environment). On this view, linguistics quickly emerges as a complexity science in which the tools of linguistic grammars, LLMs, and sociolinguistic observations all find a homogeneous home. Recent work on complex systems, especially in biological systems theory, has breathed new life into this interdisciplinary field of inquiry. I argue that the study of language, including the inner workings of both the human mind and ChatGPT, belongs within this growing framework.
For decades, computational and theoretical linguists have been talking different languages. The shocking syntactic successes of modern LLMs and ChatGPT have forced them into the same room. Realising that languages are real patterns emerging from biological systems gets someone to break the awkward silence…
Featured image by Google DeepMind via Unsplash (public domain)
I recently received a note from Prof. Nirmalya Chakraborty (Rabindra Bharati University) about an exciting new digital library. It includes three categories: Navya-Nyāya Scholarship in Nabadwip, Philosophers of Modern India, and Twentieth Century Paṇḍitas of Kolkata. You can find the site here: https://darshanmanisha.org
You can learn more about the project from the following announcement.
Announcement
Introducing the Digital Library Project
By
Bhaktivedanta Research Center, Kolkata, India
Right before the introduction of English education in India, a new style of philosophising emerged, especially in Bengal, known as Navya-Nyāya. Since Nabadwip was one of the main centres of Navya-Nyāya scholarship in Bengal from the 15th to the 17th century, many important works on Navya-Nyāya were written during this period by Nabadwip scholars. Some of these were published later, but many of those published works are no longer available, and the few copies which are available are not in good condition. These are the works in which Bengal’s intellectual contribution shines forth. We have digitized some of these materials and uploaded them to the present digital platform.
As part of the lineage of this Nabadwip tradition, many pandits (traditional scholars) resident in Kolkata during the nineteenth and early twentieth centuries produced important philosophical works, some in Sanskrit and most in Bengali. Most of these works were published in the early 1900s from Kolkata, and some from neighbouring cities. These works brought about a kind of Renaissance, reviving classical Indian philosophical deliberation in Bengal. Attempts have been made to upload these books and articles to the present digital platform.
With the introduction of colonial education, a group of philosophers trained in European philosophy tried to interpret insights from classical Indian philosophy in a new light. Kolkata was one of the main centres of this cosmopolitan philosophical scholarship. The works of many of these philosophers from Kolkata were published in the early to mid twentieth century. These philosophers are the true representatives of twentieth-century Indian philosophy. Efforts have been made to upload these works to the present digital platform.
The purpose of constructing the present digital platform is to give researchers access to these philosophical works, in the hope that the contributions of these philosophers will be studied and critically assessed, enriching the philosophical repertoire.
We take this opportunity to appeal to fellow scholars to enrich this digital library by lending us their personal collections related to these areas for digitization.
The website address of the Digital Library is: www.darshanmanisha.org
For further correspondence, please write to:
Continuing to discuss On Certainty, we get deeply into textual quotes.
How does he actually respond to Moore's argument about his hand? How does he extend his account to talk about mathematical and scientific statements? Is Wittgenstein a pragmatist?
The post Ep. 309: Wittgenstein On Certainty (Part Two) first appeared on The Partially Examined Life Philosophy Podcast.

Discussing the notes Ludwig Wittgenstein made at the end of his life in 1951 that were published as On Certainty in 1969.
Can we coherently doubt propositions like "physical objects exist," "the world is more than 50 years old," and "this is my hand"? Wittgenstein looks at these questions via his framework of language games. Is doubting one of these a legitimate move in a game?
Check out the Overthink podcast and Conversations with Coleman. Attend our live show in NYC on April 15.
The post Ep. 309: Wittgenstein On Certainty (Part One) first appeared on The Partially Examined Life Philosophy Podcast.

Readers of the Indian Philosophy Blog may be interested to learn about a new article in the latest issue of the Journal of World Philosophies: “Pramāṇavāda and the Crisis of Skepticism in the Modern Public Sphere” by Amy Donahue (Kennesaw State University). The journal is open-access, and you can download the article here.
Here’s the abstract:
There is widespread and warranted skepticism about the usefulness of inclusive and epistemically rigorous public debate in societies that are modeled on the Habermasian public sphere, and this skepticism challenges the democratic form of government worldwide. To address structural weaknesses of Habermasian public spheres, such as susceptibility to mass manipulation through “ready-to-think” messages and tendencies to privilege and subordinate perspectives arbitrarily, interdisciplinary scholars should attend to traditions of knowledge and public debate that are not rooted in western colonial/modern genealogies, such as the Sanskritic traditions of pramāṇavāda and vāda. Attention to vāda, pramāṇavāda, and other traditions like them can inspire new forms of social discussion, media, and digital humanities, which, in turn, can help to place trust in democracy on foundations that are more stable than mere (anxious) optimism.
I enjoyed reading the article, and I found it extremely thought-provoking. I hope readers of this blog will check it out. Also, be sure to look for the forthcoming online debate platform that Donahue mentions on p. 5! Maybe we’ll make an announcement on the blog when it’s ready. Or reach out to Dr. Donahue if you’re interested in collaborating.
Here are a few of my questions for further discussion:
My questions here are meant to be taken in the spirit of vāda to keep the conversation going. I hope others will read Donahue’s thought-provoking article and join this worthwhile conversation.
Also, if you will be attending the upcoming Central APA Conference in Denver, Colorado, USA on Feb. 22, 2023, you will have the chance to discuss these and other issues in person!
Wed. Feb. 22, 2023, 1-4pm
2022 Invited Symposium: Vāda: Indian Logic and Public Debate
Chair: Jarrod Brown (Berea College)
Speakers:
Amy Donahue (Kennesaw State University) “Vāda Project: A Non-Centric Method for Countering Disinformation”
Arindam Chakrabarti (University of Hawai’i at Manoa) “Does the Question Arise? Questioning the Meaning of Questions and the Definability of Doubt”
Ethan Mills (University of Tennessee at Chattanooga) “Cārvāka Skepticism about Inference: Historical and Contemporary Examples”
(More information about the conference here, including a draft program that includes several other panels on Indian philosophy.)
Works Cited
Donahue, Amy. 2022. “Pramāṇavāda and the Crisis of Skepticism in the Modern Public Sphere.” Journal of World Philosophies 7 (Winter 2022): 1–14.
Matilal, Bimal Krishna. 1998. The Character of Logic in India. Edited by Jonardon Ganeri and Heeraman Tiwari. Albany: SUNY Press.