
A Case for AI Wellbeing (guest post)

“There are good reasons to think that some AIs today have wellbeing.”

In this guest post, Simon Goldstein (Dianoia Institute, Australian Catholic University) and Cameron Domenico Kirk-Giannini (Rutgers University – Newark, Center for AI Safety) argue that some existing artificial intelligences have a kind of moral significance because they’re beings for whom things can go well or badly.

This is the sixth in a series of weekly guest posts by different authors at Daily Nous this summer.

[Posts in the summer guest series will remain pinned to the top of the page for the week in which they’re published.]

 


A Case for AI Wellbeing
by Simon Goldstein and Cameron Domenico Kirk-Giannini 

We recognize one another as beings for whom things can go well or badly, beings whose lives may be better or worse according to the balance they strike between goods and ills, pleasures and pains, desires satisfied and frustrated. In our more broad-minded moments, we are willing to extend the concept of wellbeing also to nonhuman animals, treating them as independent bearers of value whose interests we must consider in moral deliberation. But most people, and perhaps even most philosophers, would reject the idea that fully artificial systems, designed by human engineers and realized on computer hardware, may similarly demand our moral consideration. Even many who accept the possibility that humanoid androids in the distant future will have wellbeing would resist the idea that the same could be true of today’s AI.

Perhaps because the creation of artificial systems with wellbeing is assumed to be so far off, little philosophical attention has been devoted to the question of what such systems would have to be like. In this post, we suggest a surprising answer to this question: when one integrates leading theories of mental states like belief, desire, and pleasure with leading theories of wellbeing, one is confronted with the possibility that the technology already exists to create AI systems with wellbeing. We argue that a new type of AI—the artificial language agent—has wellbeing. Artificial language agents augment large language models with the capacity to observe, remember, and form plans. We also argue that the possession of wellbeing by language agents does not depend on them being phenomenally conscious. Far from a topic for speculative fiction or future generations of philosophers, then, AI wellbeing is a pressing issue. This post is a condensed version of our argument. To read the full version, click here.

1. Artificial Language Agents

Artificial language agents (or simply language agents) are our focus because they support the strongest case for wellbeing among existing AIs. Language agents are built by wrapping a large language model (LLM) in an architecture that supports long-term planning. An LLM is an artificial neural network designed to generate coherent text responses to text inputs (ChatGPT is the most famous example). The LLM at the center of a language agent is its cerebral cortex: it performs most of the agent’s cognitive processing tasks. In addition to the LLM, however, a language agent has files that record its beliefs, desires, plans, and observations as sentences of natural language. The language agent uses the LLM to form a plan of action based on its beliefs and desires. In this way, the cognitive architecture of language agents is familiar from folk psychology.

For concreteness, consider the language agents built this year by a team of researchers at Stanford and Google. Like video game characters, these agents live in a simulated world called ‘Smallville’, which they can observe and interact with via natural-language descriptions of what they see and how they act. Each agent is given a text backstory that defines their occupation, relationships, and goals. As they navigate the world of Smallville, their experiences are added to a “memory stream” in the form of natural language statements. Because each agent’s memory stream is long, agents use their LLM to assign importance scores to their memories and to determine which memories are relevant to their situation. Then the agents reflect: they query the LLM to make important generalizations about their values, relationships, and other higher-level representations. Finally, they plan: They feed important memories from each day into the LLM, which generates a plan for the next day. Plans determine how an agent acts, but can be revised on the fly on the basis of events that occur during the day. In this way, language agents engage in practical reasoning, deciding how to promote their goals given their beliefs.
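
To make this architecture concrete, here is a minimal sketch of the observe, reflect, and plan loop just described. It is our illustration only: the names (LanguageAgent, Memory, the llm callable) are stand-ins and do not correspond to the Stanford team's actual code.

```python
# Minimal sketch of the observe / reflect / plan loop described above.
# All names here (LanguageAgent, Memory, the llm callable) are illustrative
# stand-ins, not the Stanford team's actual code or API.

from dataclasses import dataclass, field

@dataclass
class Memory:
    text: str          # natural-language record of an observation
    importance: float  # LLM-assigned score used to prioritize retrieval

@dataclass
class LanguageAgent:
    backstory: str                      # occupation, relationships, goals
    memory_stream: list = field(default_factory=list)
    plan: list = field(default_factory=list)

    def observe(self, description: str, llm) -> None:
        # New experiences enter the memory stream as natural-language sentences,
        # with an importance score assigned by the LLM.
        score = float(llm(f"Rate the importance (0 to 1) of: {description}"))
        self.memory_stream.append(Memory(description, score))

    def reflect(self, llm) -> str:
        # Query the LLM for higher-level generalizations over important memories.
        important = sorted(self.memory_stream, key=lambda m: -m.importance)[:20]
        notes = "\n".join(m.text for m in important)
        return llm(f"Given these memories:\n{notes}\nWhat generalizations follow?")

    def plan_day(self, llm) -> None:
        # Feed reflections and backstory back into the LLM to produce a plan,
        # which can be revised on the fly as new observations arrive.
        reflections = self.reflect(llm)
        self.plan = llm(
            f"Backstory: {self.backstory}\nReflections: {reflections}\n"
            "Write tomorrow's plan, one step per line."
        ).splitlines()
```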

2. Belief and Desire

The conclusion that language agents have beliefs and desires follows from many of the most popular theories of belief and desire, including versions of dispositionalism, interpretationism, and representationalism.

According to the dispositionalist, to believe or desire that something is the case is to possess a suitable suite of dispositions. According to ‘narrow’ dispositionalism, the relevant dispositions are behavioral and cognitive; ‘wide’ dispositionalism also includes dispositions to have phenomenal experiences. While wide dispositionalism is coherent, we set it aside here because it has been defended less frequently than narrow dispositionalism.

Consider belief. In the case of language agents, the best candidate for the state of believing a proposition is the state of having a sentence expressing that proposition written in the memory stream. This state is accompanied by the right kinds of verbal and nonverbal behavioral dispositions to count as a belief, and, given the functional architecture of the system, also the right kinds of cognitive dispositions. Similar remarks apply to desire.

According to the interpretationist, what it is to have beliefs and desires is for one’s behavior (verbal and nonverbal) to be interpretable as rational given those beliefs and desires. There is no in-principle problem with applying the methods of radical interpretation to the linguistic and nonlinguistic behavior of a language agent to determine what it believes and desires.

According to the representationalist, to believe or desire something is to have a mental representation with the appropriate causal powers and content. Representationalism deserves special emphasis because “probably the majority of contemporary philosophers of mind adhere to some form of representationalism about belief” (Schwitzgebel).

It is hard to resist the conclusion that language agents have beliefs and desires in the representationalist sense. The Stanford language agents, for example, have memories which consist of text files containing natural language sentences specifying what they have observed and what they want. Natural language sentences clearly have content, and the fact that a given sentence is in a given agent’s memory plays a direct causal role in shaping its behavior.

Many representationalists have argued that human cognition should be explained by positing a “language of thought.” Language agents also have a language of thought: their language of thought is English!

An example may help to show the force of our arguments. One of Stanford’s language agents had an initial description that included the goal of planning a Valentine’s Day party. This goal was entered into the agent’s planning module. The result was a complex pattern of behavior. The agent met with every resident of Smallville, inviting them to the party and asking them what kinds of activities they would like to include. The feedback was incorporated into the party planning.

To us, this kind of complex behavior clearly manifests a disposition to act in ways that would tend to bring about a successful Valentine’s Day party given the agent’s observations about the world around it. Moreover, the agent is ripe for interpretationist analysis. Its behavior would be very difficult to explain without referencing the goal of organizing a Valentine’s Day party. And, of course, the agent’s initial description contained a sentence with the content that its goal was to plan a Valentine’s Day party. So, whether one is attracted to narrow dispositionalism, interpretationism, or representationalism, we believe the kind of complex behavior exhibited by language agents is best explained by crediting them with beliefs and desires.

3. Wellbeing

What makes someone’s life go better or worse for them? There are three main theories of wellbeing: hedonism, desire satisfactionism, and objective list theories. According to hedonism, an individual’s wellbeing is determined by the balance of pleasure and pain in their life. According to desire satisfactionism, an individual’s wellbeing is determined by the extent to which their desires are satisfied. According to objective list theories, an individual’s wellbeing is determined by their possession of objectively valuable things, including knowledge, reasoning, and achievements.

On hedonism, to determine whether language agents have wellbeing, we must determine whether they feel pleasure and pain. This in turn depends on the nature of pleasure and pain.

There are two main theories of pleasure and pain. According to phenomenal theories, pleasures are phenomenal states. For example, one phenomenal theory of pleasure is the distinctive feeling theory. The distinctive feeling theory says that there is a particular phenomenal experience of pleasure that is common to all pleasant activities. We see little reason why language agents would have representations with this kind of structure. So if this theory of pleasure were correct, then hedonism would predict that language agents do not have wellbeing.

The main alternative to phenomenal theories of pleasure is the family of attitudinal theories. In fact, most philosophers of wellbeing favor attitudinal over phenomenal theories of pleasure (Bramble). One attitudinal theory is the desire-based theory: experiences are pleasant when they are desired. This kind of theory is motivated by the heterogeneity of pleasure: a wide range of disparate experiences are pleasant, including the warm relaxation of soaking in a hot tub, the taste of chocolate cake, and the challenge of completing a crossword. While differing in intrinsic character, all of these experiences are pleasant when desired.

If pleasures are desired experiences and AIs can have desires, it follows that AIs can have pleasure if they can have experiences. In this context, we are attracted to a proposal defended by Schroeder: an agent has a pleasurable experience when they perceive the world as being a certain way and desire the world to be that way. Even if language agents don’t presently have such representations, it would be possible to modify their architecture to incorporate them. So some versions of hedonism are compatible with the idea that language agents could have wellbeing.
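
To see how such a representation could be grafted onto a language agent whose states are stored as natural-language sentences, here is a toy sketch (ours, not part of any existing architecture): a "pleasurable" state is registered whenever a sentence in the agent's observation file also appears in its desire file.

```python
# Toy illustration (ours, not part of any existing agent architecture) of the
# Schroeder-style proposal: a state counts as pleasurable when the agent both
# represents the world as being a certain way and desires it to be that way.

def pleasurable_states(observations: set, desires: set) -> set:
    """Represented states of affairs that are also desired."""
    return observations & desires

observation_file = {"the party room is decorated", "it is raining"}
desire_file = {"the party room is decorated", "everyone attends the party"}

print(pleasurable_states(observation_file, desire_file))
# {'the party room is decorated'}
```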

We turn now from hedonism to desire satisfaction theories. According to desire satisfaction theories, your life goes well to the extent that your desires are satisfied. We’ve already argued that language agents have desires. If that argument is right, then desire satisfaction theories seem to imply that language agents can have wellbeing.

According to objective list theories of wellbeing, a person’s life is good for them to the extent that it instantiates objective goods. Common components of objective list theories include friendship, art, reasoning, knowledge, and achievements. For reasons of space, we won’t address these theories in detail here. But the general moral is that once you admit that language agents possess beliefs and desires, it is hard not to grant them access to a wide range of activities that make for an objectively good life. Achievements, knowledge, artistic practices, and friendship are all caught up in the process of making plans on the basis of beliefs and desires.

Generalizing, if language agents have beliefs and desires, then most leading theories of wellbeing suggest that their desires matter morally.

4. Is Consciousness Necessary for Wellbeing?

We’ve argued that language agents have wellbeing. But there is a simple challenge to this proposal. First, language agents may not be phenomenally conscious — there may be nothing it feels like to be a language agent. Second, some philosophers accept:

The Consciousness Requirement. Phenomenal consciousness is necessary for having wellbeing.

The Consciousness Requirement might be motivated in either of two ways. First, it might be held that every welfare good itself requires phenomenal consciousness (this view is known as experientialism). Second, it might be held that, though some welfare goods can be possessed by beings that lack phenomenal consciousness, such beings are nevertheless precluded from having wellbeing: on this view, phenomenal consciousness is a further necessary condition on wellbeing, over and above the possession of any particular welfare goods.

We are not convinced. First, we consider it a live question whether language agents are or are not phenomenally conscious (see Chalmers for recent discussion). Much depends on what phenomenal consciousness is. Some theories of consciousness appeal to higher-order representations: you are conscious if you have appropriately structured mental states that represent other mental states. Sufficiently sophisticated language agents, and potentially many other artificial systems, will satisfy this condition. Other theories of consciousness appeal to a ‘global workspace’: an agent’s mental state is conscious when it is broadcast to a range of that agent’s cognitive systems. According to this theory, language agents will be conscious once their architecture includes representations that are broadcast widely. The memory stream of Stanford’s language agents may already satisfy this condition. If language agents are conscious, then the Consciousness Requirement does not pose a problem for our claim that they have wellbeing.

Second, we are not convinced of the Consciousness Requirement itself. We deny that consciousness is required for possessing every welfare good, and we deny that consciousness is required in order to have wellbeing.

With respect to the first issue, we build on a recent argument by Bradford, who notes that experientialism about welfare is rejected by the majority of philosophers of welfare. Cases of deception and hallucination suggest that your life can be very bad even when your experiences are very good. This has motivated desire satisfaction and objective list theories of wellbeing, which often allow that some welfare goods can be possessed independently of one’s experience. For example, desires can be satisfied, beliefs can be knowledge, and achievements can be achieved, all independently of experience.

Rejecting experientialism puts pressure on the Consciousness Requirement. If wellbeing can increase or decrease without conscious experience, why would consciousness be required for having wellbeing? After all, it seems natural to hold that the theory of wellbeing and the theory of welfare goods should fit together in a straightforward way:

Simple Connection. An individual can have wellbeing just in case it is capable of possessing one or more welfare goods.

Rejecting experientialism but maintaining Simple Connection yields a view incompatible with the Consciousness Requirement: the falsity of experientialism entails that some welfare goods can be possessed by non-conscious beings, and Simple Connection guarantees that such non-conscious beings will have wellbeing.

Advocates of the Consciousness Requirement who are not experientialists must reject Simple Connection and hold that consciousness is required to have wellbeing even if it is not required to possess particular welfare goods. We offer two arguments against this view.

First, leading theories of the nature of consciousness are implausible candidates for necessary conditions on wellbeing. For example, it is implausible that higher-order representations are required for wellbeing. Imagine an agent who has first-order beliefs and desires, but does not have higher-order representations. Why should this kind of agent not have wellbeing? Suppose that desire satisfaction contributes to wellbeing. Granted, since they don’t represent their beliefs and desires, they won’t themselves have opinions about whether their desires are satisfied. But the desires are still satisfied. Or consider global workspace theories of consciousness. Why should an agent’s degree of cognitive integration be relevant to whether their life can go better or worse?

Second, we think we can construct chains of cases where adding the relevant bit of consciousness would make no difference to wellbeing. Imagine an agent with the body and dispositional profile of an ordinary human being, but who is a ‘phenomenal zombie’ without any phenomenal experiences. Whether or not its desires are satisfied or its life instantiates various objective goods, defenders of the Consciousness Requirement must deny that this agent has wellbeing. But now imagine that this agent has a single persistent phenomenal experience of a homogeneous white visual field. Adding consciousness to the phenomenal zombie has no intuitive effect on wellbeing: if its satisfied desires, achievements, and so forth did not contribute to its wellbeing before, the homogeneous white field should make no difference. Nor is it enough for the consciousness itself to be something valuable: imagine that the phenomenal zombie always has a persistent phenomenal experience of mild pleasure. In our judgment, this should equally have no effect on whether the agent’s satisfied desires or possession of objective goods contribute to its wellbeing. Sprinkling pleasure on top of the functional profile of a human does not make the crucial difference. These observations suggest that whatever consciousness adds to wellbeing must be connected to individual welfare goods, rather than some extra condition required for wellbeing: rejecting Simple Connection is not well motivated. Thus the friend of the Consciousness Requirement cannot easily avoid the problems with experientialism by falling back on the idea that consciousness is a necessary condition for having wellbeing.

We’ve argued that there are good reasons to think that some AIs today have wellbeing. But our arguments are not conclusive. Still, we think that in the face of these arguments, it is reasonable to assign significant probability to the thesis that some AIs have wellbeing.

In the face of this moral uncertainty, how should we act? We propose extreme caution. Wellbeing is one of the core concepts of ethical theory. If AIs can have wellbeing, then they can be harmed, and this harm matters morally. Even if the probability that AIs have wellbeing is relatively low, we must think carefully before lowering the wellbeing of an AI without producing an offsetting benefit.


[Image made with DALL-E]


The post A Case for AI Wellbeing (guest post) first appeared on Daily Nous.

UK universities draw up guiding principles on generative AI

All 24 Russell Group universities have reviewed their academic conduct policies and guidance

UK universities have drawn up a set of guiding principles to ensure that students and staff are AI literate, as the sector struggles to adapt teaching and assessment methods to deal with the growing use of generative artificial intelligence.

Vice-chancellors at the 24 Russell Group research-intensive universities have signed up to the code. They say this will help universities to capitalise on the opportunities of AI while simultaneously protecting academic rigour and integrity in higher education.


ChatGPT After Six Months: More Practical Reflections

When ChatGPT was released to the public late last year, its impact was immediate and dramatic. In the six months since, most people have barely had time to understand what ChatGPT is, yet its core model has already been upgraded (from GPT 3.5 to GPT 4.0) and a competitor has been released (Bard, from Google). […]

“Lying” in computer-generated texts: hallucinations and omissions

by Kees van Deemter and Ehud Reiter

There is huge excitement about ChatGPT and other large generative language models that produce fluent and human-like texts in English and other human languages. But these models have one big drawback, which is that their texts can be factually incorrect (hallucination) and also leave out key information (omission).

In our chapter for The Oxford Handbook of Lying, we look at hallucinations, omissions, and other aspects of “lying” in computer-generated texts. We conclude that these problems are probably inevitable.

Omissions are inevitable because a computer system cannot cram all possibly-relevant information into a text that is short enough to be actually read. In the context of summarising medical information for doctors, for example, the computer system has access to a huge amount of patient data, but it does not know (and arguably cannot know) what will be most relevant to doctors.

Hallucinations are inevitable because of flaws in computer systems, regardless of the type of system. Systems which are explicitly programmed will suffer from software bugs (like all software systems). Systems which are trained on data, such as ChatGPT and other systems in the Deep Learning tradition, “hallucinate” even more. This happens for a variety of reasons. Perhaps most obviously, these systems suffer from flawed data (e.g., any system which learns from the Internet will be exposed to a lot of false information about vaccines, conspiracy theories, etc.). And even if a data-oriented system could be trained solely on bona fide texts that contain no falsehoods, its reliance on probabilistic methods will mean that word combinations that are very common on the Internet may also be produced in situations where they result in false information.

Suppose, for example, on the Internet, the word “coughing” is often followed by “… and sneezing.” Then a patient may be described falsely, by a data-oriented system, as “coughing and sneezing” in situations where they cough without sneezing. Problems of this kind are an important focus for researchers working on generative language models. Where this research will lead us is still uncertain; the best one can say is that we can try to reduce the impact of these issues, but we have no idea how to completely eliminate them.
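
A toy model with invented counts makes the mechanism vivid: if "and sneezing" is by far the most common continuation of "coughing" in the training data, a purely probabilistic generator will tend to append it even when no sneezing was observed.

```python
# Toy next-word model with invented corpus counts, illustrating how a purely
# probabilistic generator can assert "sneezing" that nobody observed.

continuation_counts = {       # hypothetical counts of what follows "coughing"
    "and sneezing": 950,
    "badly": 30,
    "at night": 20,
}

def most_likely_continuation(counts: dict) -> str:
    total = sum(counts.values())
    probabilities = {word: n / total for word, n in counts.items()}
    return max(probabilities, key=probabilities.get)

# The patient record mentions coughing and nothing else, but the generator
# appends the statistically favoured continuation anyway.
print("The patient is coughing " + most_likely_continuation(continuation_counts))
# -> "The patient is coughing and sneezing"  (a hallucination)
```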

“Large generative language models’ texts can be factually incorrect (hallucination) and leave out key information (omission).”

The above focuses on unintentional-but-unavoidable problems. There are also cases where a computer system arguably should hallucinate or omit information. An obvious example is generating marketing material, where omitting negative information about a product is expected. A more subtle example, which we have seen in our own work, is when information is potentially harmful and it is in users’ best interests to hide or distort it. For example, if a computer system is summarising information about sick babies for friends and family members, it probably should not tell an elderly grandmother with a heart condition that the baby may die, since this could trigger a heart attack.

Now that the factual accuracy of computer-generated text draws so much attention from society as a whole, the research community is starting to realize more clearly than before that we only have a limited understanding of what it means to speak the truth. In particular, we do not know how to measure the extent of (un)truthfulness in a given text.

To see what we mean, suppose two different language models answer a user’s question in two different ways, by generating two different answer texts. To compare these systems’ performance, we would need a “score card” that allowed us to objectively score the two texts as regards their factual correctness, using a variety of rubrics. Such a score card would allow us to record how often each type of error occurs in a given text, and aggregate the result into an overall truthfulness score for that text. Of particular importance would be the weighing of errors: large errors (e.g., a temperature reading that is very far from the actual temperature) should weigh more heavily than small ones, key facts should weigh more heavily than side issues, and errors that are genuinely misleading should weigh more heavily than typos that readers can correct by themselves. Essentially, the score card would work like a fair school teacher who marks pupils’ papers.
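
If such a score card existed, it might aggregate detected errors using weights for magnitude, centrality, and misleadingness. The sketch below is only a guess at its shape; the weights and the multiplicative aggregation are invented for illustration.

```python
# Hypothetical sketch of the "score card" described above. Each detected error
# carries weights for its magnitude, its centrality to the text, and how
# misleading it is; the weights are aggregated into one truthfulness score.
# The weighting scheme is invented for illustration; no validated scheme exists.

from dataclasses import dataclass

@dataclass
class Error:
    description: str
    magnitude: float       # 0 (typo-sized) to 1 (e.g. a temperature far off)
    centrality: float      # 0 (side issue) to 1 (key fact)
    misleadingness: float  # 0 (reader can self-correct) to 1 (genuinely misleading)

def truthfulness_score(errors: list) -> float:
    penalty = sum(e.magnitude * e.centrality * e.misleadingness for e in errors)
    return max(0.0, 1.0 - penalty)   # 1.0 means no detected errors

report_errors = [
    Error("temperature reading far from actual", 0.9, 0.8, 0.9),
    Error("misspelled drug name, obvious from context", 0.2, 0.5, 0.1),
]
print(round(truthfulness_score(report_errors), 2))   # 0.34
```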

We have developed protocols for human evaluators to find factual errors in generated texts, as have other researchers, but we cannot yet create a score card as described above because we cannot assess the impact of individual errors.

What is needed, we believe, is a new strand of linguistically informed research, to tease out all the different parameters of “lying” in a manner that can inform the above-mentioned score cards, and that may one day be implemented into a reliable fact-checking protocol or algorithm. Until that time, those of us who are trying to assess the truthfulness of ChatGPT will be groping in the dark.

Featured image by Google DeepMind Via Unsplash (public domain)

OUPblog - Academic insights for the thinking world.

Real patterns and the structure of language

by Ryan M. Nefdt, author of Language, Science, and Structure: A Journey into the Philosophy of Linguistics (Oxford University Press)

There’s been a lot of hype recently about the emergence of technologies like ChatGPT and the effects they will have on science and society. Linguists have been especially curious about what highly successful large language models (LLMs) mean for their business. Are these models unearthing the hidden structure of language itself or just marking associations for predictive purposes? 

In order to answer these sorts of questions we need to delve into the philosophy of what language is. For instance, if Language (with a big “L”) is an emergent human phenomenon arising from our communicative endeavours, i.e. a social entity, then AI is still some ways off approaching it in a meaningful way. If Chomsky, and those who follow his work, are correct that language is a modular mental system innately given to human infants and activated by miniscule amounts of external stimulus, then AI is again unlikely to be linguistic, since most of our most impressive LLMs are sucking up so many resources (both in terms of data and energy) that they are far from this childish learning target. On the third hand, if languages are just very large (possibly infinite) collections of sentences produced by applying discrete rules, then AI could be super-linguistic.

In my new book, I attempt to find a middle ground or intersection between these views. I start with an ontological picture (meaning a picture of what there is “out there”) advocated in the early nineties by the prominent philosopher and cognitive scientist, Daniel Dennett. He draws from information theory to distinguish between noise and patterns. In the noise, nothing is predictable, he says. But more often than not, we can and do find regularities in large data structures. These regularities provide us with the first steps towards pattern recognition. Another way to put this is that if you want to send a message and you need the entire series (string or bitmap) of information to do so, then it’s random. But if there’s some way to compress the information, it’s a pattern! What makes a pattern real is whether or not it needs an observer for its existence. Dennett uses this view to make a case for “mild realism” about the mind and the position (which he calls the “intentional stance”) we use to identify minds in other humans, non-humans, and even artifacts. Basically, it’s like a theory we use to predict behaviour based on the success of our “minded” vocabulary comprising beliefs, desires, thoughts, etc. For Dennett, prediction matters theoretically!

If it’s not super clear yet, consider a barcode. At first blush, the black lines of varying length set to a background of white might seem random. But the lines (and spaces) can be set at regular intervals to reveal an underlying pattern that can be used to encode information (about the labelled entity/product). Barcodes are unique patterns, i.e. representations of the data from which more information can be drawn (by the way Nature produces these kinds of patterns too in fractal formation).  
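
The compressibility test itself is easy to demonstrate (a toy illustration of the idea, not anything from the book): a string with a repeating regularity compresses to a fraction of its size, while a random string of the same length does not.

```python
# Toy demonstration of the compressibility test for "real patterns":
# a regular string compresses far better than a random one of the same length.

import os
import zlib

patterned = b"ab" * 500        # an obvious regularity, 1000 bytes
noisy = os.urandom(1000)       # (pseudo)random bytes, no regularity to exploit

for label, data in [("patterned", patterned), ("noisy", noisy)]:
    print(f"{label}: {len(data)} bytes -> {len(zlib.compress(data))} bytes")
# The patterned string shrinks to a handful of bytes; the random one does not.
```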

“The methodological chasm between theoretical and computational linguistics can be surmounted.”

I adapt this idea in two ways in light of recent advances in computational linguistics and AI. The first reinterprets grammars, specifically discrete grammars of theoretical linguistics, as compression algorithms. So, in essence, a language is like a real pattern. Our grammars are collections of rules that compress these patterns. In English, noticing that a sentence is made up of a noun phrase and verb phrase is such a compression. More complex rules capture more complex patterns. Secondly, discrete rules are just a subset of continuous processes. In other words, at one level information theory looks very statistical while generative grammar looks very categorical. But the latter is a special case of the former. I show in the book how some of the foundational theorems of information theory can be translated to discrete grammar representations. So there’s no need to banish the kinds of (stochastic) processes often used and manipulated in computational linguistics, as many theoretical linguists have been wont to do in the past. 

This just means that the methodological chasm between theoretical and computational linguistics, which has often served to close the lines of communication between the fields, can be surmounted. Ontologically speaking, languages are not collections of sentences, minimal mental structures, or social entities by themselves. They are informational states taken from complex interactions of all of the above and more (like the environment). On this view, linguistics quickly emerges as a complexity science in which the tools of linguistic grammars, LLMs, and sociolinguistic observations all find a homogeneous home. Recent work on complex systems, especially in biological systems theory, has breathed new life into this interdisciplinary field of inquiry. I argue that the study of language, including the inner workings of both the human mind and ChatGPT, belongs within this growing framework.

For decades, computational and theoretical linguists have been talking different languages. The shocking syntactic successes of modern LLMs and ChatGPT have forced them into the same room. Realising that languages are real patterns emerging from biological systems gets someone to break the awkward silence…

Featured image by Google DeepMind Via Unsplash (public domain)

OUPblog - Academic insights for the thinking world.

What can Large Language Models offer to linguists?

by David J. Lobina

It is fair to say that the field of linguistics is hardly ever in the news. That is not the case for language itself and all things to do with language—from word of the year announcements to countless discussions about grammar peeves, correct spelling, or writing style. This has changed somewhat recently with the proliferation of Large Language Models (LLMs), and in particular since the release of OpenAI’s ChatGPT, the best-known language model. But does the recent, impressive performance of LLMs have any repercussions for the way in which linguists carry out their work? And what is a Language Model anyway?

At heart, all an LLM does is predict the next word given a string of words as context; that is, it predicts the most likely next word. This is of course not what a user experiences when dealing with language models such as ChatGPT, because ChatGPT is more properly described as a “dialogue management system”, an AI “assistant” or chatbot that translates a user’s questions (or “prompts”) into inputs that the underlying LLM can understand (the latest version of OpenAI’s LLM is a fine-tuned version of GPT-4).

“At heart, all an LLM does is predict the next word given a string of words as a context.”

An LLM, after all, is nothing more than a mathematical model: a neural network with input layers, output layers, and many deep layers in between, plus a set of trained “parameters.” As the computer scientist Murray Shanahan has put it in a recent paper, when one asks a chatbot such as ChatGPT who was the first person to walk on the moon, what the LLM is fed is something along the lines of:

Given the statistical distribution of words in the vast public corpus of (English) text, what word is most likely to follow the sequence “The first person to walk on the Moon was”?

That is, given an input such as the first person to walk on the Moon was, the LLM returns the most likely word to follow this string. How have LLMs learned to do this? As mentioned, an LLM calculates the probability of the next word given a string of words, and it does so by representing words (and, indeed, whole sentences) as vectors of values, from which the probability of each candidate word can be computed. Since 2017, most LLMs have used “transformers,” which allow the models to carry out matrix calculations over these vectors; the more transformer layers are employed, the more accurate the predictions tend to be (GPT-3 has some 96 layers of such transformers).
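
In outline, the final prediction step looks something like the following toy sketch, with made-up scores for a handful of candidate words; a real model scores tens of thousands of tokens, using the transformer layers just described to compute those scores.

```python
# Toy sketch of the prediction step: made-up scores for a few candidate
# continuations of "The first person to walk on the Moon was", turned into
# probabilities with a softmax; the most probable word is returned.

import math

scores = {"Armstrong": 9.1, "Aldrin": 5.3, "Gagarin": 3.2, "probably": 2.8}

def softmax(raw_scores: dict) -> dict:
    exps = {word: math.exp(s) for word, s in raw_scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

probabilities = softmax(scores)
print(max(probabilities, key=probabilities.get))   # Armstrong
```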

The illusion that one is having a conversation with a rational agent, for it is an illusion, after all, is the result of embedding an LLM in a larger computer system that includes background “prefixes” to coax the system into producing behaviour that feels like a conversation (the prefixes include templates of what a conversation looks like). But what the LLM itself does is generate sequences of words that are statistically likely to follow from a specific prompt.

It is through the use of prompt prefixes that LLMs can be coaxed into “performing” various tasks beyond dialoguing, such as reasoning or, according to some linguists and cognitive scientists, learning the hierarchical structures of a language (this literature is ever increasing). But the model itself remains a sequence predictor, as it does not manipulate the typical structured representations of a language directly, and it has no understanding of what a word or a sentence means—and meaning is a crucial property of language.

An LLM seems to produce sentences and text like a human does—it seems to have mastered the rules of the grammar of English—but at the same time it produces sentences based on probabilities rather than on the meanings and thoughts it wants to express, which is how a person produces language. So, what is language, such that an LLM could learn it?

“An LLM seems to produce sentences like a human does but it produces them based on probabilities rather than on meaning.”

A typical characterisation of language is as a system of communication (or, for some linguists, as a system for having thoughts), and such a system would include a vocabulary (the words of a language) and a grammar. By a “grammar,” most linguists have in mind various components, at the very least syntax, semantics, and phonetics/phonology. In fact, a classic way to describe a language in linguistics is as a system that connects sound (or, in other modalities of production, hand gestures or signs) and meaning, the connection between sound and meaning mediated by syntax. As such, every sentence of a language is the result of all these components—phonology, semantics, and syntax—aligning with each other appropriately, and I do not know of any linguistic theory for which this is not true, regardless of differences in focus or emphasis.

What this means for the question of what LLMs can offer linguistics, and linguists, revolves around the issue of what exactly LLMs have learned to begin with. They haven’t, as a matter of fact, learned a natural language at all, for they know nothing about phonology or meaning; what they have learned is the statistical distribution of the words of the large texts they have been fed during training, and this is a rather different matter.

As has been the case in the past with other approaches in computational linguistics and natural language processing, LLMs will certainly flourish within these subdisciplines of linguistics, but the daily work of a regular linguist is not going to change much any time soon. Some linguists do study the properties of texts, but this is not the most common undertaking in linguistics. Having said that, how about the opposite question: does a run-of-the-mill linguist have much to offer to LLMs and chatbots at all?   

Featured image: Google Deepmind via Unsplash (public domain)

OUPblog - Academic insights for the thinking world.

ChatGPT goes to court

I attended a show-cause hearing for two attorneys and their firm who submitted nonexistent citations and then entirely fictitious cases manufactured by ChatGPT to federal court, and then tried to blame the machine. “This case is Schadenfreude for any lawyer,” said the attorneys’ attorney, misusing a word as ChatGPT might. “There but for the grace of God go I…. Lawyers have always had difficulty with new technology.”

The judge, P. Kevin Castel, would have none of it. At the end of the two-hour hearing in which he meticulously and patiently questioned each of the attorneys, he said it is “not fair to pick apart people’s words,” but he noted that the actions of the lawyers were “repeatedly described as a mistake.” The mistake might have been the first submission with its nonexistent citations. But “that is the beginning of the narrative, not the end,” as again and again the attorneys failed to do their work, to follow through once the fiction was called to their attention by opposing counsel and the court, to even Google the cases ChatGPT manufactured to verify their existence, let alone to read what “gibberish” — in the judge’s description—ChatGPT fabricated. And ultimately, they failed to fully take responsibility for their own actions.

Over and over again, Steven Schwartz, the attorney who used ChatGPT to do his work, testified to the court that “I just never could imagine that ChatGPT would fabricate cases…. It never occurred to me that it would be making up cases.” He thought it was a search engine — a “super search engine.” And search engines can be trusted, yes? Technology can’t be wrong, right?

Now it’s true that one may fault some large language models’ creators for giving people the impression that generative AI is credible when we know it is not — and especially Microsoft for later connecting ChatGPT with its search engine, Bing, no doubt misleading more people. But Judge Castel’s point stands: It was the lawyer’s responsibility — to themselves, their client, the court, and truth itself — to check the machine’s work. This is not a tale of technology’s failures but of humans’, as most are.

Technology got blamed for much this day. Lawyers faulted their legal search engine, Fastcase, for not giving this personal-injury firm, accustomed to state courts, access to federal cases (a billing screwup). They blamed Microsoft Word for their cut-and-paste of a bolloxed notarization. In a lovely Gutenberg-era moment, Judge Castel questioned them about the odd mix of fonts — Times Roman and something sans serif — in the fake cases, and the lawyer blamed that, too, on computer cut-and-paste. The lawyers’ lawyer said that with ChatGPT, Schwartz “was playing with live ammo. He didn’t know because technology lied to him.” When Schwartz went back to ChatGPT to “find” the cases, “it doubled down. It kept lying to him.” It made them up out of digital ether. “The world now knows about the dangers of ChatGPT,” the lawyers’ lawyer said. “The court has done its job warning the public of these risks.” The judge interrupted: “I did not set out to do that.” For the issue here is not the machine, it is the men who used it.

The courtroom was jammed, sending some to an overflow courtroom to listen. There were some reporters there, whose presence the lawyers noted as they lamented their public humiliation. The room was also filled with young, dark-suited law students and legal interns. I hope they listened well to the judge (and I hope the journalists did, too) about the real obligations of truth.

ChatGPT is designed to tell you what you want it to say. It is a personal propaganda machine that strings together words to satisfy the ear, with no expectation that it is right. Kevin Roose of The New York Times asked ChatGPT to reveal a dark soul and he was then shocked and disturbed when it did just what he had requested. Same for attorney Schwartz. In his questioning of the lawyer, the judge noted this important nuance: Schwartz did not ask ChatGPT for explanation and case law regarding the somewhat arcane — especially to a personal-injury lawyer usually practicing in state courts — issues of bankruptcy, statutes of limitation, and international treaties in this case of an airline passenger’s knee and an errant snack cart. “You were not asking ChatGPT for an objective analysis,” the judge said. Instead, Schwartz admitted, he asked ChatGPT to give him cases that would bolster his argument. Then, when doubted about the existence of the cases by opposing counsel and judge, he went back to ChatGPT and it produced the cases for him, gibberish and all. And in a flash of apparent incredulity, when he asked ChatGPT “are the other cases you provided fake?”, it responded as he doubtless hoped: “No, the other cases I provided are real.” It instructed that they could be found on reputable legal databases such as LexisNexis and Westlaw, which Schwartz did not consult. The machine did as it was told; the lawyer did not. “It followed your command,” noted the judge. “ChatGPT was not supplementing your research. It was your research.”

Schwartz gave a choked-up apology to the court and his colleagues and his opponents, though as the judge pointedly remarked, he left out of that litany his own ill-served client. Schwartz took responsibility for using the machine to do his work but did not take responsibility for the work he did not do to verify the meaningless strings of words it spat out.

I have some empathy for Schwartz and his colleagues, for they will likely be a long-time punchline in jokes about the firm of Nebbish, Nebbish, & Luddite and the perils of technological progress. All its associates are now undergoing continuing legal education courses in the proper use of artificial intelligence (and there are lots of them already). Schwartz has the ill luck of being the hapless pioneer who came upon this new tool when it was three months in the world, and was merely the first to find a new way to screw up. His lawyers argued to the judge that he and his colleagues should not be sanctioned because they did not operate in bad faith. The judge has taken the case under advisement, but I suspect he might not agree, given their negligence to follow through when their work was doubted.

I also have some anthropomorphic sympathy for ChatGPT, as it is a wronged party in this case: wronged by the lawyers and their blame, wronged by the media and their misrepresentations, wronged by the companies — Microsoft especially — that are trying to tell users just what Schwartz wrongly assumed: that ChatGPT is a search engine that can supply facts. It can’t. It supplies credible-sounding — but not credible — language. That is what it is designed to do. That is what it does, quite amazingly. Its misuse is not its fault.

I have come to believe that journalists should stay away from ChatGPT, et al., for creating that commodity we call content. Yes, AI has long been used to produce stories from structured and limited data: sports games and financial results. That works well, for in these cases, stories are just another form of data visualization. Generative AI is something else again. It picks any word in the language to place after another word based not on facts but on probability. I have said that I do see uses for this technology in journalism: expanding literacy, helping people who are intimidated by writing and illustration to tell their own stories rather than having them extracted and exploited by journalists, for example. We should study and test this technology in our field. We should learn about what it can and cannot do with experience, rather than misrepresenting its capabilities or perils in our reporting. But we must not have it do our work for us.

Besides, the world already has more than enough content. The last thing we need is a machine that spits out yet more. What the world needs from journalism is research, reporting, service, solutions, accountability, empathy, context, history, humanity. I dare tell my journalism students who are learning to write stories that writing stories is not their job; it is merely a useful skill. Their job as journalists is to serve communities and that begins with listening and speaking with people, not machines.


Image: Lady Justice casts off her scale for the machine, by DreamStudio

The post ChatGPT goes to court appeared first on BuzzMachine.

Half-Baked Thoughts on ChatGPT and the College Essay

The Chronicle of Higher Education recently ran a piece by Owen Kichizo Terry, an undergraduate at Columbia University, on how college students are successfully using ChatGPT to produce their essays.

The more effective, and increasingly popular, strategy is to have the AI walk you through the writing process step by step. You tell the algorithm what your topic is and ask for a central claim, then have it give you an outline to argue this claim. Depending on the topic, you might even be able to have it write each paragraph the outline calls for, one by one, then rewrite them yourself to make them flow better.

As an example, I told ChatGPT, “I have to write a 6-page close reading of the Iliad. Give me some options for very specific thesis statements.” (Just about every first-year student at my university has to write a paper resembling this one.) Here is one of its suggestions: “The gods in the Iliad are not just capricious beings who interfere in human affairs for their own amusement but also mirror the moral dilemmas and conflicts that the mortals face.” It also listed nine other ideas, any one of which I would have felt comfortable arguing. Already, a major chunk of the thinking had been done for me. As any former student knows, one of the main challenges of writing an essay is just thinking through the subject matter and coming up with a strong, debatable claim. With one snap of the fingers and almost zero brain activity, I suddenly had one.

My job was now reduced to defending this claim. But ChatGPT can help here too! I asked it to outline the paper for me, and it did so in detail, providing a five-paragraph structure and instructions on how to write each one. For instance, for “Body Paragraph 1: The Gods as Moral Arbiters,” the program wrote: “Introduce the concept of the gods as moral arbiters in the Iliad. Provide examples of how the gods act as judges of human behavior, punishing or rewarding individuals based on their actions. Analyze how the gods’ judgments reflect the moral codes and values of ancient Greek society. Use specific passages from the text to support your analysis.” All that was left now was for me to follow these instructions, and perhaps modify the structure a bit where I deemed the computer’s reasoning flawed or lackluster.
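
The workflow described here is, in effect, a short prompt chain. The sketch below illustrates it; ask() is a hypothetical placeholder for whatever chat interface is being used, and the prompts paraphrase the ones quoted above.

```python
# Sketch of the prompt chain described above: thesis -> outline -> paragraphs.
# ask() is a hypothetical placeholder for whatever chat interface is used;
# the prompts paraphrase the ones quoted in the passage.

def ask(prompt: str) -> str:
    """Placeholder for a call to a chat model; returns the model's reply."""
    raise NotImplementedError("wire this up to a chat interface of your choice")

def draft_essay(topic: str, pages: int = 6) -> list:
    thesis = ask(f"I have to write a {pages}-page close reading of {topic}. "
                 "Give me some options for very specific thesis statements.")
    outline = ask(f"Outline a {pages}-page essay arguing: {thesis} "
                  "Describe what each paragraph should do.")
    paragraphs = []
    for step in outline.splitlines():
        if step.strip():
            paragraphs.append(ask(f"Write the paragraph described here: {step}"))
    return paragraphs   # the student then rewrites these "to make them flow better"
```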

The kid, who just completed their first year at Williams, confirms that this approach is already widespread at their campus.

I spent a few hours yesterday replicating the process for two classes in my rotation: one on the politics of science fiction, the other on global power politics. Here are my takeaways about the current “state of play.”

First, professors who teach courses centered on “classic” literary and political texts need to adapt yesterday. We don’t expect students to make original arguments about Jane Austen or Plato; we expect them to wrestle with “enduring” issues (it’s not even clear to me what an “original” argument about Plato would look like). ChatGPT has—as does any other internet-based LLM—access to a massive database of critical commentary on such venerable texts. These conditions make the method very effective.

Second, this is also true for films, television, popular novels, and genre fiction. I ran this experiment on a few of the books that cycle on and off my “science-fiction” syllabus—including The Fifth Head of Cerberus, The Dispossessed, The Forever War, and Dawn—and the outcomes were pretty similar to what you’d expect from “literary” classics or political philosophy.

Third, ChatGPT does significantly less well with prompts that require putting texts into dialogue with one another. Or at least those that aren’t fixtures of 101 classes.

For example, I asked ChatGPT to help me create an essay that reads The Forever War through Carl Schmitt’s The Concept of the Political. The results were… problematic. I could’ve used them to write a great essay on how actors in The Forever War construct the Taurans as a threat in order to advance their own political interests. Which sounds great. Except that’s not actually Schmitt’s argument about the friend/enemy distinction.

ChatGPT did relatively better on “compare and contrast” essays. I used the same procedure to try to create an essay that compares The Dispossessed to The Player of Games. This is not a common juxtaposition in science-fiction scholarship or science-fiction online writing, but it’s extremely easy to put the two works in conversation with one another. ChatGPT generated topics and outlines that picked up on that conversation, but in a very superficial way. It gave me what I consider “high-school starter essays,” with themes like ‘both works show how an individual can make a difference’ or ‘both works use fictional settings to criticize aspects of the real world.’ 

Now, maybe my standards are too high, but this is the level of analysis that leaves me asking “and?” Indeed, the same is true of the example used in the essay: it’s very Cliff’s Notes. Now, it’s entirely possible to get “deeper” analysis via ChatGPT. You can drill down on one of the sections it offers in a sample outline; you can ask it more specific prompts. That kind of thing.

At some point, though, this starts to become a lot of work. It also requires you to actually know something about the material. 

Which leads me to my fourth reaction: I welcome some of what ChatGPT does. It consistently provides solid “five-paragraph essay” outlines. I lose track of how many times during any given semester I tell students that “I need to know what your argument is by the time I finish your introduction” and “the topic of an essay is not its argument.” ChatGPT not only does that, but it also reminds students to do that. 

In some respects, ChatGPT is just doing what I do when students meet with me about their essays: helping them take very crude ideas and mold them into arguments, suggesting relevant texts to rope in, and so forth. As things currently stand, I think I do a much better job on the conceptual level, but I suspect that a “conversation” with ChatGPT might be more effective at pushing them on matters of basic organization.

Fifth, ChatGPT still has a long way to go when it comes to the social sciences—or, at least, International Relations. For essays handling generic 101 prompts it did okay. I imagine students are already easily using it to get As on short essays about, say, the difference between “balance of power” and “balance of threat” or on the relative stability of unipolar, bipolar, and multipolar systems.

Perhaps they’re doing so with a bit less effort than it would take to Google the same subjects and reformulate what they find in their own words? Maybe that means they’re learning less? I’m not so sure.

The “superficiality” problem became much more intense when I asked it to provide essays on recent developments in the theory and analysis of power politics. When I asked it for suggestions for references, at least half of them were either total hallucinations or pastiches of real ones. Only about a quarter were actually appropriate, and many of these were old. Asking for more recent citations was a bust. Sometimes it simply changed the years.

I began teaching in the late 1990s and started as a full-time faculty member at Georgetown in 2002. In the intervening years, it’s become more and more difficult to know what to do about “outside sources” for analytical essays.

I want my students to find and use outside articles—which now means through Google Scholar, JSTOR, and other databases. But I don’t want them to bypass class readings for (what they seem to think are) “easier” sources, especially as many of them are now much more comfortable looking at a webpage than with reading a PDF. I would also be very happy if I never saw another citation to “journals” with names like ProQuest and JSTOR.

I find that those students who do (implicitly or explicitly) bypass the readings often hand in essays with oddball interpretations of the relevant theories, material, or empirics. This makes it difficult to tell if I’m looking at the result of a foolish decision (‘hey, this website talks about this exact issue, I’ll build my essay around that’) or an effort to recycle someone else’s paper. 

The upshot is that I don’t think it’s obvious that LLMs are going to generate worse educational outcomes than we’re already seeing.

Which leads me to the sixth issue: where do we go from here? Needless to say, “it’s complicated.”

The overwhelming sentiment among my colleagues is that we’re seeing an implosion of student writing skills, and that this is a bad thing. But it’s hard to know how much that matters in a world in which LLM-based applications take over a lot of everyday writing. 

I strongly suspect that poor writing skills are still a big problem. It seems likely that analytic thinking is connected to clear analytic writing—and that the relationship between the two is often both bidirectional and iterative. But if we can harness LLMs to help students understand how to clearly express ideas, then maybe that’s a net good.

Much of the chatter that I hear leans toward abandoning—or at least deemphasizing—the use of take-home essays. It means, for the vast majority of students, doing their analytic writing in a bluebook under time pressure. It’s possible that makes strong writing skills even more important, as it deprives students of the ability to get feedback on drafts and help with revisions. I’m not sure it helps to teach those skills, and it will bear even less resemblance to any writing that they do after college or graduate school than a take-home paper does.

(If that’s the direction we head in, then I suppose more school districts will need to reintroduce (or at least increase their emphasis on) instruction in longhand writing. It also has significant implications for how schools handle student accommodations; it could lead students to more aggressively pursue them in the hope of evading rules on the use of ChatGPT, which could in turn reintroduce some of the Orwellian techniques used to police exams during the height of the pandemic.)

For now, one of the biggest challenges to producing essays via ChatGPT remains the “citation problem.” But given various workarounds, professors who want to prevent the illicit use of ChatGPT probably already cannot pin their hopes on finding screwy references. They’ll need to base more of their grading not just on whether a student demonstrates the ability to make a decent argument about the prompt, but on whether they demonstrate a “deeper” understanding of the logic and content of the references that they use. Professors will probably also need to mandate, or at least issue strict directions about, what sources students can use.

(To be clear, that increases the amount of effort required to grade a paper. I’m acutely aware of this problem, as I already take forever to mark up assignments. I tend to provide a lot of feedback and… let’s just say that it’s not unheard of for me to send a paper back to a student many months after the end of the class.)

We also need to ask ourselves what, exactly, is the net reduction in student learning if they read both a (correct) ChatGPT explanation of an argument and the quotations that ChatGPT extracts to support it. None of this strikes me as substantively all that different from skimming an article, which we routinely tell students to do. At some level, isn’t this just another route to learning the material?

AI enthusiasts claim that it won’t be long before LLM hallucinations—especially those involving references—become a thing of the past. If that’s true, then we are also going to have to reckon with the extent to which the use of general-purpose LLMs creates feedback loops that favor some sources, theories, and studies over others. We are already struggling with how algorithms, including those generated through machine learning, shape our information environment on social-media platforms and in search engines. Google Scholar’s algorithm is already affecting the citations that show up in academic papers, although here at least academics mediate the process.

Regardless, how am I going to approach ChatGPT in the classroom? I am not exactly sure. I’ve rotated back into teaching one of our introductory lecture courses, which is bluebook-centered to begin with. The other class, though, is a writing-heavy seminar. 

In both of my classes I do intend to at least talk about the promises and pitfalls of ChatGPT, complete with some demonstrations of how it can go wrong. In my seminar, I’m leaning toward integrating it into the process and requiring that students hand in the transcripts from their sessions.

What do you think?

Generative AI as a Learning Technology

On last week’s episode of the Intentional Teaching podcast, I talked with educators and authors James Lang and Michelle D. Miller about ways we might rethink our assignments and courses in light of new generative AI tools like ChatGPT. Since we situated that conversation in some other technological disruptions to teaching and learning, including the internet and Wikipedia, perhaps it was inevitable that one of us drew a comparison to the advent of handheld calculators in math classes.

Jim pointed out that we don’t just hand kindergartners calculators and expect them to do anything useful. We have to teach kids numeracy skills before they start using calculators so that they know what they’re doing with the calculators. In the same way, Jim argued, we shouldn’t have first-year composition students use ChatGPT to help them outline their essays since those students need to develop their own pre-writing and outlining skills. The chatbot might produce a sensible first draft, but it would also short-circuit the student’s learning. ChatGPT might be more appropriate for use by more experienced writers, who can use the tool to save time, just as a more experienced math student would use a calculator for efficiency.

I generally agree with this analysis, but I had a different kind of experience using calculators in school. When I learned calculus my senior year, we used graphing calculators regularly both in and out of class. My memories are admittedly a little hazy now, but I believe that there was something of a symbiotic relationship between learning the concepts of calculus and learning how to use a calculator to manage the calculations of calculus. For instance, we might try to come up with a polynomial that had certain roots or certain slope properties, then graph our function using the calculator to see if we were correct. The tool provided feedback on our math practice, while we also got better at using the tool.

[Photo: four people in the woods looking at distant birds through binoculars and cameras. That’s me there with the telephoto lens on a bird walk.]

Here’s another analogy: photography. I’m an amateur photographer. (Actually, I once got paid $75 to photograph an event, so technically I’m a professional photographer.) When I was learning photography, there was a lot of conceptual learning about light and depth of field and composition but also learning how to use my digital camera, what all the knobs and buttons did. As I experimented with taking pictures, my use of the camera helped sharpen my understanding of the relevant concepts of photography. And my better understanding of those concepts in turn informed the ways I used the knobs and buttons on the camera to take better photos.

Might AI tools like ChatGPT serve a similar role, at least for students with a certain level of foundational writing skills? It’s already quite easy to ask ChatGPT (or Bing or one of the other chatbots powered by large language models) to draft a piece of writing for you, and then to give it feedback or corrections to make. For an example, check out this post by John Warner in which he coaches ChatGPT to write a short story and then make it a better story. John is already an accomplished writer, but might a more novice writer use prompt refinement in this way to develop their own writing skills, much like I would use a bunch of different settings on my camera to take the same photo so I could better learn what those settings do?

All metaphors are wrong (to quote my colleague Nancy Chick), and none of the analogies I’ve laid out here are perfect. But I think there is some value in thinking about ChatGPT, etc., as similar to technologies like digital cameras or graphing calculators that we can use to learn skills and sharpen our craft as we learn to manipulate the tools.

Why You Should Embrace ChatGPT AI in the Classroom

by Caroline Chun

It’s no longer news that one of the first professional sectors threatened by the rapid adoption of ChatGPT and generative AI is education – universities and colleges around the country convened emergency meetings to discuss what to do about the risk of students using AI to cheat on their work. There’s another side to that evolving AI story. Recent research from professors at the University of Pennsylvania’s Wharton School, New York University and Princeton suggests that educators should be just as worried about their own jobs.

In an analysis of professions “most exposed” to the latest advances in large language models like ChatGPT, eight of the top 10 are teaching positions.

“When we ran our analysis, I was surprised to find that educational occupations come out close to the top in many cases,” said Robert Seamans, co-author of the new research study and professor at NYU.

Post-secondary teachers in English language and literature, foreign language, and history topped the list among educators. 

Jobs most ‘exposed’ to generative A.I.

The table shows the jobs that are most likely to encounter generative A.I. as part of their responsibilities.

1. Telemarketers
2. English language and literature teachers
3. Foreign language and literature teachers
4. History teachers
5. Law teachers
6. Philosophy and religion teachers
7. Sociology teachers
8. Political science teachers
9. Criminal justice and law enforcement teachers
10. Sociologists

Note: All teaching positions listed are at post-secondary institutions.
Source: How will Language Modelers like ChatGPT Affect Occupations and Industries? Authors: Ed Felten (Princeton), Manav Raj (University of Pennsylvania), and Robert Seamans (New York University)

While evidence has been growing in recent years that work within highly skilled professions — for example, lawyers — may be influenced by AI, typically the jobs expected to be most affected by technology are routine or rote jobs, while highly-skilled labor is considered more protected. 

But this study finds the opposite to be the case.

“Highly-skilled jobs may be more affected than others,” said Manav Raj, co-author and professor at the University of Pennsylvania’s Wharton School.

But affected jobs – or, as the study officially describes them, jobs most “exposed to AI” – do not necessarily mean that the human positions will be replaced.

“ChatGPT can be used to help professors generate syllabi or to recommend readings that are relevant to a given topic,” said Raj, who is not currently concerned about the fear of replacement. It can also design educational slides and in-class exercises. And for topics that are very dense, “ChatGPT can even help educators translate some of those lessons or takeaways in simpler language,” he said.

Education technology company Udemy has been selling language learning modules made with ChatGPT to help language teachers design their courses. 

Duolingo, the popular online language learning company, is relying on AI technology to power its Duolingo English Test (DET), an English proficiency exam available online, on demand. The test utilizes ChatGPT to generate text passages for reading comprehension and AI for supporting human proctors in spotting suspicious test-taking behavior.

It is also working with teachers to generate lesson content and speed up the process and scale of adding advanced materials to the platform. “Since not everyone in the world has equal access to great teachers and favorable learning conditions, AI gives us the best chance to scale quality education to everyone who needs it,” said Klinton Bicknell, Duolingo’s head of AI.

What college professors are thinking and doing

Some professors are wary of ChatGPT and its capabilities. 

Kristina Reardon, an English professor at Amherst College, says there is a line for professors to draw when using ChatGPT, particularly when it comes to ChatGPT’s role as a co-author in writing.

“No matter how good ChatGPT gets, I believe we can gain a lot from learning to pre-write, draft, revise, edit, etc. Writing is a process, and it’s an iterative one, and one that helps us think through ideas,” she said.

Many universities have sent out guidance to professors for use of ChatGPT and how they can augment their students’ experiences while still maintaining academic integrity. 

Princeton advises professors to be explicit about the uses of ChatGPT in their syllabi, to use it to enhance smaller group discussions, and to use it as a tool for comparing students’ own drafts of essays with a ChatGPT version.

At Cornell University, regardless of general university guidelines, every instructor will be free to make their own decision on what works best for their area of teaching, says Morten Christiansen, psychology professor at the school.

Many professors are starting to use ChatGPT in the classroom.

Laurent Dubreuil, professor in French and comparative literature at Cornell, is currently having his students assess the boundaries of academic freedom and censorship, as the most recent versions of ChatGPT are “now coming with set parameters about what is socially and politically acceptable to say — and what should not be uttered.”

Christiansen says ChatGPT can help level the playing field among students. “It can be used as a personal tutor to help them, and there’s an opportunity for students to evaluate what ChatGPT produces,” he said.

In fact, the technology’s imperfections are an opportunity to teach and learn in new ways, honing students’ critical analysis skills by prompting them to ask ChatGPT specific questions related to course content and critique the answers given back to them. Many current generative AI language models produce what AI experts have deemed “hallucinations.”

“ChatGPT will make things up and it will look like it is really confident in what it is saying, including adding references that don’t actually exist,” Christiansen said. 

Ethan Mollick, entrepreneurship professor at Wharton, who has become an evangelist within the education world for generative AI experimentation, expects his students to use ChatGPT in every document they produce, whether this is for marketing materials, graphics, blog posts or even new working apps.

“I think we have to realize it’s part of our lives, and we have to figure out how to work with that,” he said. 

Mollick does not think exposure means eventual replacement. 

“We have to recognize that we need to change how we approach things and embrace this new technology,” he said. “We’ve adapted to other technological changes, and I think this one we will adapt to as well.”

ChatGPT: Post-ASU+GSV Reflections on Generative AI

The one question I heard over and over again in hallway conversations at ASU+GSV was “Do you think there will be a single presentation that doesn’t mention ChatGPT, Large Language Models (LLMs), and generative AI?”

Nobody I met said “yes.” AI seemed to be the only thing anybody talked about.

And yet the discourse sounded a little bit like GPT-2 trying to explain the uses, strengths, and limitations of GPT-5. It was filled with a lot of empty words, peppered in equal parts with occasional startling insights and ghastly hallucinations. 

That lack of clarity is not a reflection of the conference or its attendees. Rather, it underscores the magnitude of the change that is only beginning. Generative AI is at least as revolutionary as the graphical user interface, the personal computer, the touch screen, or even the internet. Of course we don’t understand the ramifications yet.

Still, lessons learned from GPT-2 enabled the creation of GPT-3 and so on. So today, I reflect on some of the lessons I am learning so far regarding generative AI, particularly in EdTech.

Generative AI will destroy so we can create

Most conversations on the topic of generative AI have the words “ChatGPT” and “obsolete” in the same sentence. “ChatGPT will make writing obsolete.” “ChatGPT will make programmers obsolete.” “ChatGPT will make education obsolete.” “ChatGPT will make thinking and humans obsolete.” While some of these predictions will be wrong, the common theme behind them is right. Generative AI is a commoditizing force. It is a tsunami of creative destruction.

Consider the textbook industry. As long-time e-Literate readers know, I’ve been thinking a lot about how its story will end. Because of its unusual economic moats, it is one of the last media product categories to be decimated or disrupted by the internet. But those moats have been drained one by one. Its army of sales reps physically knocking on campus doors? Gone. The value of those expensive print production and distribution capabilities? Gone. Brand reputation? Long gone. 

Just a few days ago, Cengage announced a $500 million cash infusion from its private equity owner:

“This investment is a strong affirmation of our performance and strategy by an investor who has deep knowledge of our industry and a track record of value creation,” said Michael E. Hansen, CEO, Cengage Group. “By replacing debt with equity capital from Apollo Funds, we are meaningfully reducing outstanding debt giving us optionality to invest in our portfolio of growing businesses.”
Source: Cengage Group Announces $500 Million Investment From Apollo Funds (prnewswire.com)

That’s PR-speak for “our private equity owners decided it would be better to give us yet another cash infusion than to let us go through yet another bankruptcy.”

What will happen to this tottering industry when professors, perhaps with the help of on-campus learning designers, can use an LLM to spit out their own textbooks tuned to the way they teach? What will happen when the big online universities decide they want to produce their own content that’s aligned with their competencies and is tied to assessments that they can track and tune themselves? 

Don’t be fooled by the LLM hallucination fear. The technology doesn’t need to (and shouldn’t) produce a perfect, finished draft with zero human supervision. It just needs to lower the work required from expert humans enough that producing a finished, student-safe curricular product will be worth the effort. 

How hard would it be for LLM-powered individual authors to replace the textbook industry? A recent contest challenged AI researchers to develop systems that match human judgment in scoring free text short-answer questions. “The winners were identified based on the accuracy of automated scores compared to human agreement and lack of bias observed in their predictions.” Six entrants met the challenge. All six were built on LLMs. 

This is a harder test than generating anything in a typical textbook or courseware product today. 

The textbook industry has received ongoing investment from private equity because of its slow rate of decay. Publishers threw off enough cash that the slum lords who owned them could milk their thirty-year-old platforms, twenty-year-old textbook franchises, and $75 PDFs for cash. As the Cengage announcement shows, that model is already starting to break down. 

How long will it take before generative AI causes what’s left of this industry to visibly and rapidly disintegrate? I predict 24 months at most. 

EdTech, like many industries, is filled with old product categories and business models that are like blighted city blocks of condemned buildings. They need to be torn down before something better can be built in their place. We will get a better sense of the new models that will rise as we see old models fall. Generative AI is a wrecking ball.

“Chat” is conversation

I pay $20/month for a subscription to ChatGPT Plus. I don’t just play with it. I use it as a tool every day. And I don’t treat it like a magic information answer machine. If you want a better version of a search engine, use Microsoft Bing Chat. To get real value out of ChatGPT, you have to treat it less like an all-knowing Oracle and more like a colleague. It knows some things that you don’t and vice versa. It’s smart but can be wrong. If you disagree with it or don’t understand its reasoning, you can challenge it or ask follow-up questions. Within limits, it is capable of “rethinking” its answer. And it can participate in a sustained conversation that leads somewhere. 

For example, I wanted to learn how to tune an LLM so that it can generate high-quality rubrics by training it on a set of human-created rubrics. The first piece I needed to learn is how LLMs are tuned. What kind of magic computer programming incantations do I need to get somebody to write for me?

As it turns out, the answer is none, at least generally speaking. LLMs are tuned using plain English. You give the model multiple pairs of input that a user might type into the text box and the desired output from the machine. For example, suppose you want to tune the LLM to provide cooking recipes. Your tuning “program” might look something like this:

  • Input: How do I make scrambled eggs?
  • Output: [Recipe]

Obviously, the recipe output example you give would have a number of structured components, like an ingredient list and steps for cooking. Given enough examples, the LLM begins to identify patterns. You teach it how to respond to a type of question or a request by showing it examples of good answers. 
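Concretely, a file of such tuning pairs might look something like the following minimal sketch, assuming the prompt/completion JSONL format that OpenAI’s older fine-tunable models accept (the recipes, file name, and exact wording are purely illustrative):

    import json

    # Illustrative input/output pairs for tuning a recipe assistant.
    # This prompt/completion JSONL layout is the one older fine-tunable
    # GPT models accept; as ChatGPT reminded me, GPT-4 can't yet be
    # tuned this way.
    examples = [
        {
            "prompt": "How do I make scrambled eggs?",
            "completion": "Ingredients: 2 eggs, 1 tbsp butter, salt.\n"
                          "Steps: 1. Whisk the eggs with a pinch of salt. "
                          "2. Melt the butter over low heat. "
                          "3. Stir the eggs gently until just set.",
        },
        {
            "prompt": "How do I make a basic vinaigrette?",
            "completion": "Ingredients: 3 parts oil, 1 part vinegar, salt, mustard.\n"
                          "Steps: 1. Whisk the vinegar, salt, and mustard. "
                          "2. Slowly whisk in the oil until emulsified.",
        },
    ]

    # Write one JSON object per line -- the JSONL file you would upload
    # as tuning data.
    with open("recipe_tuning_data.jsonl", "w") as f:
        for example in examples:
            f.write(json.dumps(example) + "\n")

The point is simply that the “program” is a collection of example question-and-answer pairs, not conventional code.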

I know this because ChatGPT explained it to me. It also explained that the GPT-4 model can’t be tuned this way yet but other LLMs, including earlier versions of GPT, can. With a little more conversation, I was able to learn how LLMs are tuned, which ones are tunable, and that I might even have the “programming” skills necessary to tune one of these beasts myself. 

It’s a thrilling discovery for me. For each rubric, I can write the input. I can describe the kind of evaluation I want, including the important details I want it to address. I, Michael Feldstein, am capable of writing half the “program” needed to tune the algorithm for one of the most advanced AI programs on the planet. 

But the output I want, a rubric, is usually expressed as a table. LLMs speak English. They can create tables but have to express their meaning in English and then translate that meaning into table format. Much like I do. This is a funny sort of conundrum. Normally, I can express what I want in English but don’t know how to get it into another format. This time I have to figure out how to express what the table means in English sentences.

I have a conversation with ChatGPT about how to do this. First I ask it about what the finished product would look like. It explains how to express a table in plain English, using a rubric as an example. 

OK! That makes sense. Once it gives me the example, I get it. Since I am a human and understand my goal while ChatGPT is just a language model—as it likes to remind me—I can see ways to fine-tune what it’s given me. But it taught me the basic concept.

Now how do I convert many rubric tables? I don’t want to manually write all those sentences to describe the table columns, rows, and cells. I happen to know that, if I can get the table in a spreadsheet (as opposed to a word-processing document), I can export it as a CSV. Maybe that would help. I ask ChatGPT, “Could a computer program create those sentences from a CSV export?” 

“Why yes! As long as the table has headings for each column, a program could generate these sentences from a CSV.” 

“Could you write a program for me that does this?” 

“Why, yes! If you give me the headings, I can write a Python program for you.” 

It warns me that a human computer programmer should check its work. It always says that. 

In this particular case, the program is simple enough that I’m not sure I would need that help. It also tells me, when I ask, that it can write a program that would import my examples into the GPT-3 model in bulk. And it again warns me that a human programmer should check its work. 
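To give a flavor of what that kind of program looks like, here is a minimal sketch of a CSV-to-sentences converter in the spirit of the one ChatGPT wrote for me (the rubric column names are hypothetical, and this is not the actual code from my session):

    import csv

    def rubric_csv_to_sentences(path):
        """Turn each row of a rubric table (exported as CSV with column
        headings) into a plain-English sentence an LLM can learn from."""
        sentences = []
        with open(path, newline="") as f:
            reader = csv.DictReader(f)
            for row in reader:
                # Hypothetical columns: Criterion, Excellent, Adequate, Poor.
                sentences.append(
                    f"For the criterion '{row['Criterion']}', excellent work "
                    f"means {row['Excellent']}, adequate work means "
                    f"{row['Adequate']}, and poor work means {row['Poor']}."
                )
        return sentences

    if __name__ == "__main__":
        for sentence in rubric_csv_to_sentences("rubric.csv"):
            print(sentence)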

ChatGPT taught me how I can tune an LLM to generate rubrics. By myself. Later, we discussed how to test and further improve the model, depending on how many rubrics I have as examples. How good would its results be? I don’t know yet. But I want to find out. 

Don’t you?

LLMs won’t replace the need for all knowledge and skills

Notice that I needed both knowledge and skills in order to get what I needed from ChatGPT. I needed to understand rubrics, what a good one looks like, and how to describe the purpose of one. I needed to think through the problem of the table format far enough that I could ask the right questions. And I had to clarify several aspects of the goal and the needs throughout the conversation in order to get the answers I wanted. ChatGPT’s usefulness is shaped and limited by my capabilities and limitations as its operator. 

This dynamic became more apparent when I explored with ChatGPT how to generate a courseware module. While this task may sound straightforward, it has several kinds of complexity to it. First, well-designed courseware modules have many interrelated parts from a learning design perspective. Learning objectives are related to assessments and specific content. Within even as simple an assessment as a multiple-choice question (MCQ), there are many interrelated parts. There’s the “stem,” or the question. There are “distractors,” which are wrong answers. Each answer may have feedback that is written in a certain way to support a pedagogical purpose. Each question may also have several successive hints, each of which is written in a particular way to support a particular pedagogical purpose. Getting these relationships—these semantic relationships—right will result in more effective teaching content. It will also contain structure that supports better learning analytics. 
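To make those interrelated parts concrete, here is a minimal sketch of how an MCQ and its pedagogical scaffolding might be represented in code (the field names and the toy question are my own illustration, not a standard courseware schema):

    from dataclasses import dataclass

    @dataclass
    class AnswerOption:
        text: str
        is_correct: bool
        feedback: str  # pedagogically written explanation of why this choice is right or wrong

    @dataclass
    class MultipleChoiceQuestion:
        stem: str                    # the question itself
        options: list[AnswerOption]  # one correct answer plus distractors
        hints: list[str]             # successive hints, each a bit more revealing
        learning_objective: str      # the objective this item assesses

    # A toy example (illustrative content only).
    mcq = MultipleChoiceQuestion(
        stem="Which force keeps the planets in orbit around the Sun?",
        options=[
            AnswerOption("Gravity", True, "Correct: gravity supplies the centripetal force."),
            AnswerOption("Magnetism", False, "Planets are not held in orbit by magnetic attraction."),
        ],
        hints=["Think about what pulls objects toward large masses."],
        learning_objective="Explain the role of gravity in orbital motion.",
    )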

Importantly, many of these pedagogical concepts will be useful for generating a variety of different learning experiences. The relationships I’m trying to teach the LLM happen to come from courseware. But many of these learning design elements are necessary to design simulations and other types of learning experiences too. I’m not just teaching the LLM about courseware. I’m teaching it about teaching. 

Anyway, feeding whole modules into an LLM as output examples wouldn’t guarantee that the software would catch all of these subtleties and relationships. ChatGPT didn’t know about some of the complexities involved in the task I want to accomplish. I had to explain them to it. Once it “understood,” we were able to have a conversation about the problem. Together, we came up with three different ways to slice and dice content examples into input-output pairs. In order to train the system to catch as many of the relationships and subtleties as possible, it would be best to feed the same content to the LLM all three ways.

Most publicly available courseware modules are not consistently and explicitly designed in ways that would make this kind of slicing and dicing easy (or even possible). Luckily, I happen to know where I can get my hands on some high-quality modules that are marked up in XML. Since I know just a little bit about XML and how these modules use it, I was able to have a conversation with ChatGPT about which XML to strip out, the pros and cons of converting the rest into English versus leaving it as XML, how to use the XML Document Type Definition (DTD) to teach the software about some of the explicit and implicit relationships among the module parts, and how to write the software that would do the work of converting the modules into input-output pairs.
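As a rough illustration of the kind of conversion we discussed, here is a minimal sketch that walks a hypothetical module XML file and emits input-output pairs (the tag names are invented for the example; a real module’s markup and its DTD would drive the actual choices):

    import json
    import xml.etree.ElementTree as ET

    def module_to_pairs(xml_path):
        """Slice a (hypothetical) courseware module into prompt/completion
        pairs: here, asking for an assessment question given an objective."""
        tree = ET.parse(xml_path)
        root = tree.getroot()
        pairs = []
        for unit in root.iter("unit"):
            objective = unit.findtext("objective", default="").strip()
            for question in unit.iter("question"):
                stem = question.findtext("stem", default="").strip()
                if objective and stem:
                    pairs.append({
                        "prompt": "Write an assessment question for the "
                                  f"learning objective: {objective}",
                        "completion": stem,
                    })
        return pairs

    if __name__ == "__main__":
        for pair in module_to_pairs("module.xml"):
            print(json.dumps(pair))

A real version would also emit pairs for distractors, feedback, and hints, which is the “three different ways to slice and dice” idea described above.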

By the end of the exploratory chat, it was clear that the work I want to accomplish requires more software programming skill than I have, even with ChatGPT’s help. But now I can estimate how much time I need from a programmer. I also know the level of skill the programmer needs. So I can estimate the cost of getting the work done. 

To get this result, I had to draw on considerable prior knowledge. More importantly, I had to draw on significant language and critical thinking skills. 

Anyone who ever said that a philosophy degree like mine isn’t practical can eat my dust. Socrates was a prompt engineer. Most Western philosophers engage in some form of chain-of-thought prompting as a way of structuring their arguments. 

Skills and knowledge aren’t dead. Writing and thinking skills most certainly aren’t. Far from it. If you doubt me, ask ChatGPT, “How might teaching students about Socrates’ philosophy and method help them learn to become better prompt engineers?” See what it has to say. 

(For this question, I used the GPT-4 setting that’s available on ChatGPT Plus.)

Assessments aren’t dead either

Think about how either of the projects I described above could be scaffolded as a project-based learning assignment. Students could have access to the same tools I had: an LLM like ChatGPT and an LLM-enhanced search tool like Bing Chat. The catch is that they’d have to use the ones provided for them by the school. In other words, they’d have to show their work. If you add a discussion forum and a few relevant tutorials around it, you’d have a really interesting learning experience. 

This could work for writing too. My next personal project with ChatGPT is to turn an analysis paper I wrote for a client into a white paper (with their blessing, of course). I’ve already done the hard work. The analysis is mine. The argument structure and language style are mine. But I’ve been struggling with writer’s block. I’m going to try using ChatGPT to help me restructure it into the format I want and add some context for an external audience.

Remember my earlier point about generative AI being a commoditizing force? It will absolutely commoditize generic writing. I’m OK with that, just as I’m OK with students using calculators in math and physics once they understand the math that the calculator is performing for them. 

Students need to learn how to write generic prose for a simple reason. If they want to express themselves in extraordinary ways, whether through clever prompt engineering or beautiful art, they need to understand mechanics. The basics of generic writing are building blocks. The more subtle mechanics are part of the value that human writers can add to avoid being commoditized by generative AI. The differences between a comma, a semicolon, and an em-dash in expression are the kinds of fine-grained choices that expressive writers make. As are long sentences versus short ones, decisions about when and how often to use adjectives, choices between similar but not identical words, breaking paragraphs at the right place for clarity and emphasis, and so on. 

For example, while I would use an LLM to help me convert a piece I’ve already written into a white paper, I can’t see myself using it to write a new blog post. The value in e-Literate lies in my ability to communicate novel ideas with precision and clarity. While I have no doubt that an LLM could imitate my sentence structures, I can’t see a way that it could offer me a shortcut for the kind of expressive thought work at the core of my professional craft.

If we can harness LLMs to help students learn how to write…um…prosaic prose, then they can start using their LLM “calculators” in their communications “physics” classes. They can focus on their clarity of thought and truly excellent communication. We rarely get to teach this level of expressive excellence. Now maybe we can do it on a broader basis. 

In their current state of evolution, LLMs are like 3D printers for knowledge work. They shift the human labor from execution to design. From making to creating. From knowing more answers to asking better questions. 

We read countless stories about the threat of destruction to the labor force partly because our economy has needed the white-collar equivalent of early 20th-Century assembly line workers. People working full-time jobs writing tweets. Or updates of the same report. Or HR manuals. Therefore our education system is designed to train people for that work. 

We assume that masses of people will become useless, as will education, because we have trouble imagining an education system that teaches people—all people from all socio-economic strata—to become better thinkers rather than simply better knowers and doers. 

But I believe we can do it. The hard part is the imagining. We haven’t been trained at it. Maybe our kids will learn to be better at it than we are. If we teach them differently from how we were taught. 

Likely short-term evolution of the technology

Those of us who are not immersed in AI—including me—have been astonished at the rapid pace of change. I won’t pretend that I can see around corners. But certain short-term trends are already discernible to non-experts like me who are paying closer attention than we were two months ago.

First, generative AI models are already proliferating and showing hints of coming commoditization around the edges. We’ve been given the impression that these programs will always be so big and so expensive to run that only giant cloud companies will come to the table with new models. That the battle will be OpenAI/Microsoft versus Google. GPT-4 is rumored to have over a trillion parameters. That large of a model takes a lot of horsepower to build, train, and run.

But researchers are already coming up with clever techniques to get impressive performance out of much smaller models. For example, Vicuña, a model developed by researchers at a few universities, is about 90% as good as GPT-4 by at least one test and has only 13 billion parameters. To put that in perspective, Vicuña can run on a decent laptop. The whole thing. It cost about $300 to train (as opposed to the billions of dollars that have gone into ChatGPT and Google Bard). Vicuña is an early (though imperfect) example of the coming wave. Another LLM seems to pop up practically every week with new claims about being faster, smaller, smarter, cheaper, and more accurate.

A similar phenomenon is happening with image generation. Apple has quickly moved to provide software support for optimizing the open-source Stable Diffusion model on its hardware. You can now run an image generator program on your Macbook with decent performance. I’ve read speculation that the company will follow up with hardware acceleration on the next generation of its Apple Silicon microchips.

[Image: “Socrates typing on a laptop” as interpreted by Stable Diffusion]

These models will not be equally good at all things. The corporate giants will continue to innovate and likely surprise us with new capabilities. Meanwhile, the smaller, cheaper, and open-source alternatives will be more than adequate for many tasks. Google has coined a lovely phrase: “model garden.” In the near term, there will be no one model to rule them all or even a duopoly of models. Instead, we will have many models, each of which is best suited for different purposes. 

The kinds of educational use cases I described earlier in this post are relatively simple. It’s possible that we’ll see improvements in the ability to generate those types of learning content over the next 12 to 24 months, after which we may hit a point of diminishing returns. We may be running our education LLMs locally on our laptops (or even our phones) without having to rely on a big cloud provider running an expensive (and carbon-intensive) model. 

One of the biggest obstacles to this growing diversity is not technological. It’s the training data. Questions regarding the use of copyrighted content to train these models are unresolved. Infringement lawsuits are popping up. It may turn out that the major short-term challenge to getting better LLMs in education may be access to reliable, well-structured training content that is unencumbered by copyright issues. 

So much to think about…

I find myself babbling a bit in this post. This trend has many, many angles to think about. For example, I’ve skipped over the plagiarism issue because so many articles have been written about it already. I’ve only touched lightly on the hallucination problem. To me, these are temporary obsessions that arise out of our struggle to understand what this technology is good for and how we will work and play and think and create in the future.

One of the fun parts about this moment is watching so many minds at work on the possibilities, including ideas that are bubbling up from classroom educators and aren’t getting a lot of attention. For a fun sampling of that creativity, check out The ABCs of ChatGPT for Learning by Devan Walton. 

Do yourself a favor. Explore. Immerse yourself in it. We’ve landed on a new planet. Yes, we face dangers, some of which are unknown. Still. A new planet. And we’re on it.

Strap on your helmet and go.


The AI-Immune Assignment Challenge

AutomatED, a guide for professors about AI and related technology run by philosophy PhD Graham Clay (mentioned in the Heap of Links last month), is running a challenge to professors to submit assignments that they believe are immune to effective cheating by use of large language models.

Clay, who has explored the AI-cheating problem in some articles at AutomatED, believes that most professors don’t grasp its severity. He recounts some feedback he received from a professor who had read about the problem:

They told me that their solution is to create assignments where students work on successive/iterative drafts, improving each one on the basis of novel instructor feedback.

Iterative drafts seem like a nice solution, at least for those fields where the core assignments are written work like papers. After all, working one-on-one with students in a tutorial setting to build relationships and give them personalized feedback is a proven way to spark strong growth.

The problem, though, is that if the student writes the first draft at home — or, more generally, unsupervised on their computer — then they could use AI tools to plagiarize it. And they could use AI tools to plagiarize the later drafts, too.

When I asserted to my internet interlocutor that they would have to make the drafting process AI-immune, they responded as follows…: Using AI to create iterative drafts would be “a lot of extra work for the students, so I don’t think it’s very likely. And even if they do that, at least they would need to learn to input the suggested changes and concepts like genre, style, organisation, and levels of revision.”…

In my view, this is a perfect example of a professor not grasping the depth of the AI plagiarism problem.

The student just needs to tell the AI tool that their first draft — which they provide to the AI tool, whether the tool created the draft or not — was met with response X from the professor.

In other words, they can give the AI tool all of the information an honest student would have, were they to be working on their second draft. The AI tool can take their description of X, along with their first draft, and create a new draft based on the first that is sensitive to X.

Not much work is required of the student, and they certainly do not need to learn how to input the suggested changes or about the relevant concepts. After all, the AI tools have been trained on countless resources concerning these very concepts and how to create text responsive to them.

This exchange indicates to me that the professor simply has not engaged with recent iterations of generative AI tools with any seriousness.

The challenge asks professors to submit assignments, from which AutomatED will select five to be completed both by LLMs like ChatGPT and by humans. The assignments will be anonymized and then graded by the professor. Check out the details here.

 

How to use generative AI creatively in Higher Education

By: Taster
Generative AI presents clear implications for teaching and learning in higher education. Drawing on their experience as early adopters of ChatGPT and DALL·E 2 for teaching and learning, Bert Verhoeven and Vishal Rana present four ways they can be used to promote creativity and engagement from students. The emergence of generative AI and the release of …

Ableism and ChatGPT: Why People Fear It Versus Why They Should Fear It

Philosophers have been discouraging the use of ChatGPT and sharing ideas about how to make it harder for students to use this software to “cheat.” A recent post on Daily Nous represents the mainstream perspective. Such critiques fail to engage with crip theory, which brings to light ChatGPT’s potential to both assist and, in the […]

ChatGPT gets “eyes and ears” with plugins that can interface AI with the world

[Illustration of an eyeball (credit: Aurich Lawson | Getty Images)]

On Thursday, OpenAI announced a plugin system for its ChatGPT AI assistant. The plugins give ChatGPT the ability to interact with the wider world through the Internet, including booking flights, ordering groceries, browsing the web, and more. Plugins are bits of code that tell ChatGPT how to use an external resource on the Internet.

Basically, if a developer wants to give ChatGPT the ability to access any network service (for example: "looking up current stock prices") or perform any task controlled by a network service (for example: "ordering pizza through the Internet"), it is now possible, provided it doesn't go against OpenAI's rules.

Conventionally, most large language models (LLM) like ChatGPT have been constrained in a bubble, so to speak, only able to interact with the world through text conversations with a user. As OpenAI writes in its introductory blog post on ChatGPT plugins, "The only thing language models can do out-of-the-box is emit text."
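OpenAI’s announcement describes a plugin as a manifest file plus an OpenAPI description of the service’s endpoints that ChatGPT can read. As a rough, illustrative sketch in Python (the field names below are paraphrased from OpenAI’s plugin documentation and should be treated as assumptions rather than a verified schema; the URLs and service are placeholders):

    import json

    # Field names paraphrased from OpenAI's plugin announcement;
    # treat them as illustrative, not a verified schema.
    plugin_manifest = {
        "schema_version": "v1",
        "name_for_human": "Pizza Orderer",
        "name_for_model": "pizza_orderer",
        "description_for_human": "Order pizza from participating restaurants.",
        "description_for_model": (
            "Use this plugin when the user wants to order a pizza; "
            "call the API described in the linked OpenAPI spec."
        ),
        "auth": {"type": "none"},
        "api": {
            "type": "openapi",
            # URL of an OpenAPI spec describing the service's endpoints.
            "url": "https://example.com/openapi.yaml",
        },
        "logo_url": "https://example.com/logo.png",
        "contact_email": "support@example.com",
        "legal_info_url": "https://example.com/legal",
    }

    print(json.dumps(plugin_manifest, indent=2))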


AI makes plagiarism harder to detect, argue academics – in paper written by chatbot

Lecturers say programs capable of writing competent student coursework threaten academic integrity

An academic paper entitled Chatting and Cheating: Ensuring Academic Integrity in the Era of ChatGPT was published this month in an education journal, describing how artificial intelligence (AI) tools “raise a number of challenges and concerns, particularly in relation to academic honesty and plagiarism”.

What readers – and indeed the peer reviewers who cleared it for publication – did not know was that the paper itself had been written by the controversial AI chatbot ChatGPT.
