
Over his decades-long career as a late-night comedy TV writer, Joe Toplyn has crafted jokes for the likes of David Letterman and Jay Leno. He has taught comedy-writing courses, instructing hundreds of adult students in how to script original comedy. And he has analyzed hundreds of jokes to understand their patterns and what makes them funny, even writing a book on the topic. So when generative AI came along, Toplyn wondered: Could he combine the linguistic skills of large language models, or LLMs, with the lessons he taught his human students to make a joke-writing machine?
The result was a wisecracking AI tool named Witscript — a web app where, for $5.99 a month, users can feed in prompts such as news headlines and image descriptions and get back jokes, funny captions, and other forms of clever wordplay.
Last year, Toplyn and Witscript competed in a laugh-off. The two spent three days writing jokes based on eight evergreen news topics. Toplyn picked his best jokes — and selected Witscript’s best material as well — then comedian Mike Perkins delivered the lines to live audiences in North Hollywood. Half the jokes in each set were written by Toplyn, the other half by Witscript.
Together with Ori Amir, a stand-up comic and neuroscientist at Pomona College in California, Toplyn recorded the performances and measured the length and loudness of the laughter each joke elicited. Toplyn and Amir found that, by those measures, the AI and human-written jokes were equally funny, a result they presented in January at the 1st Workshop on Computational Humor.
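The metrics the researchers used can be illustrated with a small sketch. This is not their actual analysis code — the function name, the threshold, and the toy audio clip are all assumptions for illustration — but it shows the idea of scoring a joke by how long and how loud the resulting laughter is:

```python
# Illustrative sketch (not the researchers' code): score a joke by the
# duration and loudness of the laughter it draws from a recording.
# `samples` is assumed to be a mono audio signal scaled to [-1.0, 1.0].

def laughter_metrics(samples, sample_rate, threshold=0.1):
    """Return (seconds of audio above a loudness floor, peak amplitude)."""
    above = [abs(s) for s in samples if abs(s) >= threshold]
    duration = len(above) / sample_rate
    peak = max((abs(s) for s in samples), default=0.0)
    return duration, peak

# Toy clip: half a second of silence followed by half a second of "laughter."
rate = 1000
clip = [0.0] * 500 + [0.5] * 500
print(laughter_metrics(clip, rate))  # → (0.5, 0.5)
```

By metrics like these, a joke that sustains loud laughter scores higher than one that draws a brief chuckle, which is the comparison the researchers made between the human- and AI-written sets.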
Until recently, jokes were thought to be beyond the reach of AI. They were considered so complex that it was thought an AI model would need to have “all the thinking ability of a typical human” to recognize or produce them, Toplyn said. That’s no longer true.
With advances in LLMs, and tools like Witscript that build on those LLMs, laughter is no longer seen as a final frontier. Although humor includes a spectrum of expression far beyond words alone — a strategic silence or well-timed eyebrow raise can elicit laughs just as well as words — “it turns out that generating at least some kinds of humor is easier for AI than driving a car safely,” Amir said.
But experts caution that there’s a vast gulf between making people laugh and truly mastering the nuances of human humor. Humor can be subtle, said Christian F. Hempelmann, a computational linguistics researcher at East Texas A&M University. It calls for understanding social norms — and knowing when to flout them. And it serves powerful social functions: Funny takes can be wielded to move past embarrassing moments, deliver a covert insult, or take the next step in a relationship. And at a moment when AI chatbots are increasingly being deployed as therapists, assistants, and companions, many experts think it will be essential for the models to understand and respond appropriately to subtler forms of humor, such as sarcasm, irreverence, and flippancy. If a person were to respond to a reminder about a colonoscopy appointment by writing, “Ugh, kill me now,” an AI assistant must know not to take those words literally.
There is a fundamental divide between crafting a clever punchline and deploying humor as humans do, according to Tristan Miller, a specialist in computational linguistics at the University of Manitoba — in part because current machines don’t embody the full breadth of ways and reasons that people use humor. Still, achieving humor is important for LLMs if they are to use language in all the ways that humans do, Miller said, because humor in all its forms is “one of the most human things about language and communication.”
Laughter has intrigued scientists for centuries. Philosophers including Plato and Descartes largely dismissed laughter, suggesting that people used humor primarily to establish superiority or in-group status by making jokes at others’ expense. Philosopher Herbert Spencer and others theorized that laughter was a sort of nervous relief in response to perceiving something to be inappropriate or incongruous. Recently, experts have sought to understand what it is about certain linguistic patterns that causes them to elicit laughs. A prevailing theory, said Miller, is that most jokes involve an interplay of ambiguity, incongruity, and a sudden resolution of that incongruity to tickle neuronal connections.
Consider this: Two fish are in a tank. One says to the other: “You man the guns, I’ll drive.” As data scientist Thomas Winters wrote in a 2021 article about computational humor, this joke works because the ambiguity of the word “tank” and the incongruity of gun-toting fish elicit surprise — and a person laughs when the wordplay clicks into place in their brain. The joke represents a delicate balance between words that create opposing mental images: the sensible image of fish in an aquarium, and the ridiculous image of fish piloting an armored tank. That equipoise is challenging for algorithms to learn, said computational humor researcher Julia Rayz of Purdue University.
Lately, however, LLMs have been rising to the challenge. Last year, social psychology researcher Drew Gorenz of the University of Southern California and his colleagues found that ChatGPT 3.5, after being prompted with 50 headlines from The Onion, could write headlines in The Onion’s satirical style well enough to rival the publication’s human-generated content. More than 200 readers rated the AI’s work as comparable to the original headlines.
Witscript refines this process even further. Toplyn, its creator, spent years breaking comedy down into a science. He analyzed hundreds of jokes, distilling the joke-writing process into a handful of straightforward algorithms. Skilled comedians often execute these steps automatically: They can come up with a setup and punchline for a joke, for instance, and then create a middle that bridges the material.
To create a system that could perform the same task, Toplyn combined his joke-writing algorithms with a large language model. By giving Witscript logical rules and structured ways to organize information, he allowed the system to take a topic sentence entered by a human user and use it to generate original replies. As he describes it, this inner framework gives Witscript the ability to pick up on keywords and to mix and substitute syllables and words when delivering its jokes. In an exchange shared on X this June, Toplyn fed Witscript the prompt, “Christie’s is auctioning off a rare 10-carat pink diamond that was once owned by Marie Antoinette,” to which the app replied, “It’s got the perfect cut – just like her head.” The joke required both wordplay and knowledge of the queen’s fate.
Toplyn is among a number of researchers who believe that a more humorous AI could produce real-world benefits. As the loneliness epidemic grows, and people seek companionship from AI, they might feel more at ease with virtual assistants or robotic companions that can tell an occasional joke, “the sort of jokes that your friend might toss into a conversation,” Toplyn said.
Hints of humor may also offer professional boosts: One recent study reported that “humor-bragging” improved people’s success with job interviews and entrepreneurs’ success with securing funding for projects. Already, AI tools can help users scatter wordplay and puns in their writing to defuse tension or craft witty emails and social media captions, said USC’s Gorenz. When he was planning a wine and cheese event, Gorenz used ChatGPT to come up with punny email signoffs such as “Brie in touch.”
“Those small little touches can add a lot to whatever you’re doing,” he said.
Still, humor composed by AI can be unpredictable in tone and quality, and it often misses the broader or personal context, Rayz said. Just as algorithms can perpetuate biases in health care and financial decisions, AI-created jokes can promote racist, sexist, and other harmful stereotypes.
Roger Saumure, a doctoral student studying marketing at the University of Pennsylvania’s Wharton School, noticed that when he asked ChatGPT, which interacts with the image-creating tool Dall-E, to make cartoons funnier, it would introduce “very odd and stereotypical” changes, such as replacing an average-sized man with an obese man wearing oversized glasses.
Saumure and his doctoral adviser looked closer. They prompted ChatGPT to produce pictures of people reading, doing laundry, or engaging in other common activities, and then prompted it to make the images funnier. The results surprised them: Representation of “gender and racial minorities decreased significantly,” Saumure said, whereas the researchers observed a dramatic increase in the representation of people who “are higher in body weight, older adults, and visually impaired individuals.” Saumure said this may reflect an overcorrection for bias against certain groups and an undercorrection for others.
The data underscore the importance of not leaving algorithmic humor unchecked, Saumure said. But Toplyn said they’re also unsurprising. The problematic results don’t reveal flaws in the algorithm, he said; rather, they highlight the high number of “horrible people in our society who think that just because you’re fat, it means you’re funny.”
Another sobering fact of AI-generated comedy, according to Hempelmann, is that much of it is, well, mediocre. He pointed out that, although Witscript was able to produce a successful stand-up set for the recent laugh-off competition in North Hollywood, it was partly because Toplyn, an expert comedian, provided Witscript’s prompts and then hand-picked only the funniest of the dozens of jokes the AI app originally wrote.
That kind of curation process requires knowledge of both the audience and the performer, Toplyn said. Witscript may write the jokes, but people must decide whether they’re funny.
Can AI know when — and why — it’s being funny? It’s a question that nags at many researchers who study computational humor, and its answer is complicated.
ChatGPT wasn’t designed to produce humor; it was designed to generate and predict text — “it can’t feel the emotions associated with laughter and really appreciating a good joke,” Gorenz explained. “But it can still produce funny things,” he said. “It kind of suggests that perhaps all you need is a lot of data and pattern recognition in noticing what makes a funny joke to make a good one yourself. You may not need to appreciate it on your own to create it.”
But for Hempelmann, genuine humor is intertwined with the intentions of the person wielding it. “Humor lets you play with meaning,” Hempelmann said — it allows people to explore multiple ideas without committing to clarity about their intentions. A joke that probes for shared values, but crosses a line into offensive terrain, can be retracted with a “no, no, I didn’t say that, I was just joking,” he said.
Humor, in other words, allows people to gauge one another’s intentions, principles, and emotional baggage in a more covert and subtle manner than they could otherwise. Some preliminary work suggests that the motivations behind a joke — and the human needs it encompasses — may be critical elements of humor: In unpublished data, both Toplyn and Gorenz have found that people find jokes less funny when they are aware that an AI wrote them.
Although AI can be prompted to craft a joke, it will never wield humor to get out of a scrape or explore possibilities, Hempelmann said. “It won’t have the same human needs.” He sees this as a major hurdle between telling jokes and achieving genuine humor. When asked whether AI will ever bridge the two, he blew a raspberry.
“The step that only humans can make,” Hempelmann said, is to have an emotional reaction to humor, even when it is not strictly funny. “You can find it gross, you can find it invasive, you can find it revealing of a new truth,” he added. “All of that, only the human can do.”
This article was originally published on Undark. Read the original article.
