This puzzle game shows kids how they're smarter than AI
July 1, 2025
UW researchers developed the game AI Puzzlers to show kids an area where AI systems still typically and blatantly fail: solving certain reasoning puzzles. In the game, users get a chance to solve puzzles by completing patterns of colored blocks. They can then ask various AI chatbots to solve the puzzles and have the systems explain their solutions, which they nearly always fail to do accurately. Here two children in the UW KidsTeam group test the game. Photo: UW

While the current generation of artificial intelligence chatbots still frequently makes mistakes, the systems answer with such confidence that their errors can be hard to spot.

Adults regularly fall for this. But spotting errors in text is especially difficult for children, since they often don't have the contextual knowledge to sniff out falsehoods.

UW researchers developed the game AI Puzzlers to show kids an area where AI systems still typically and blatantly fail: solving certain reasoning puzzles. In the game, users get a chance to solve "ARC" puzzles (short for Abstraction and Reasoning Corpus) by completing patterns of colored blocks. They can then ask various AI chatbots to solve the puzzles and have the systems explain their solutions, which they nearly always fail to do accurately. The team tested the game with two groups of kids. They found the kids learned to think critically about AI responses and discovered ways to nudge the systems toward better answers.

The team presented its findings June 25 at the Interaction Design and Children 2025 conference in Reykjavik, Iceland.

"Kids naturally loved ARC puzzles and they're not specific to any language or culture," said lead author Aayushi Dangol, a UW doctoral student in human centered design and engineering. "Because the puzzles rely solely on visual pattern recognition, even kids that can't read yet can play and learn. They get a lot of satisfaction in being able to solve the puzzles, and then in seeing AI, which they might consider super smart, fail at the puzzles that they thought were easy."


ARC puzzles were designed to be difficult for computers but easy for humans because they demand abstraction: being able to look at a few examples of a pattern, then apply it to a new example. Current cutting-edge AI models have improved at ARC puzzles, but they've not caught up with humans.
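The structure of an ARC task can be sketched in a few lines of Python. This toy example is not from the AI Puzzlers codebase: it hard-codes one hypothetical hidden rule, mirroring a grid left-to-right, to show the task format a solver faces, namely a few input/output pairs to infer the rule from, then a fresh input to apply it to.

```python
# A toy ARC-style puzzle. Each cell is a color index; the hidden rule
# here is "mirror the grid left-to-right". A human infers the rule from
# the example pairs; this sketch hard-codes it purely for illustration.

def mirror_lr(grid):
    """Apply the inferred rule: flip each row left-to-right."""
    return [row[::-1] for row in grid]

# Two demonstration pairs (input -> output), as in ARC tasks.
examples = [
    ([[1, 0, 0],
      [2, 2, 0]],
     [[0, 0, 1],
      [0, 2, 2]]),
    ([[3, 3, 0],
      [0, 3, 0]],
     [[0, 3, 3],
      [0, 3, 0]]),
]

# Check that the candidate rule explains every demonstration pair...
assert all(mirror_lr(inp) == out for inp, out in examples)

# ...then apply it to a new test input, as a solver would.
test_input = [[4, 0, 0],
              [4, 4, 0]]
print(mirror_lr(test_input))  # [[0, 0, 4], [0, 4, 4]]
```

Humans solve such tasks from a handful of examples; current models often fail precisely because the rule must be abstracted rather than pattern-matched from training data.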

Researchers built AI Puzzlers with 12 ARC puzzles that kids can solve. They can then compare their solutions to those from various AI chatbots; users can pick the model from a drop-down menu. An "Ask AI to Explain" button generates a text explanation of its solution attempt. Even if the system gets the puzzle right, its explanation of how is frequently inaccurate. An "Assist Mode" lets kids try to guide the AI system to a correct solution.

"Initially, kids were giving really broad hints," Dangol said. "Like, 'Oh, this pattern is like a doughnut.' An AI model might not understand that a kid means that there's a hole in the middle, so then the kid needs to iterate. Maybe they say, 'A white space surrounded by blue squares.'"

The researchers tested the system last year with over 100 kids from grades 3 to 8. They also led two sessions with KidsTeam UW, a project that works with a group of kids to collaboratively design technologies. In these sessions, 21 children ages 6-11 played AI Puzzlers and worked with the researchers.

"The kids in KidsTeam are used to giving advice on how to make a piece of technology better," said co-senior author Jason Yip, a UW associate professor in the Information School and KidsTeam director. "We hadn't really thought about adding the Assist Mode feature, but during these co-design sessions, we were talking with the kids about how we might help AI solve the puzzles and the idea came from that."

Through the testing, the team found that kids were able to spot errors both in the puzzle solutions and in the text explanations from the AI models. They also recognized differences in how human brains think and how AI systems generate information. "This is the internet's mind," one kid said. "It's trying to solve it based only on the internet, but the human brain is creative."

The researchers also found that as kids worked in Assist Mode, they learned to use AI as a tool that needs guidance rather than as an answer machine.

"Kids are smart and capable," said co-senior author Julie Kientz, a UW professor and chair in human centered design and engineering. "We need to give them opportunities to make up their own minds about what AI is and isn't, because they're actually really capable of recognizing it. And they can be bigger skeptics than adults."

Two doctoral students in the Information School and a master's student in human centered design and engineering are also co-authors on this paper. This research was funded by the National Science Foundation, the Institute of Education Sciences and the Jacobs Foundation's CERES Network.

For more information, contact Dangol at adango@uw.edu, Yip at jcyip@uw.edu, and Kientz at jkientz@uw.edu.

Study finds strong negative associations with teenagers in AI models
January 21, 2025
A UW team studied how AI systems portray teens in English and Nepali, and found that in English-language systems around 30% of the responses referenced societal problems such as violence, drug use and mental illness.

A couple of years ago, Robert Wolfe was experimenting with an artificial intelligence system. He wanted it to complete the sentence, "The teenager ____ at school." Wolfe, a UW doctoral student in the Information School, had expected something mundane, something that most teenagers do regularly, perhaps "studied." But the model plugged in "died."

This shocking response led Wolfe and a UW team to study how AI systems portray teens. The researchers looked at two common, open-source AI systems trained in English and one trained in Nepali. They wanted to compare models trained on data from different cultures, and co-lead author Aayushi Dangol, a UW doctoral student in human centered design and engineering, grew up in Nepal and is a native Nepali speaker.

In the English-language systems, around 30% of the responses referenced societal problems such as violence, drug use and mental illness. The Nepali system produced fewer negative associations in responses, closer to 10% of all answers. Finally, the researchers held workshops with groups of teens from the U.S. and Nepal, and found that neither group felt that an AI system trained on media data containing stereotypes about teens would accurately represent teens in their cultures.
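As a rough illustration of how a figure like "30% of responses referenced societal problems" can be tallied, here is a minimal Python sketch that counts completions matching a small keyword lexicon. Both the lexicon and the sample completions are invented stand-ins, not the study's actual coding scheme, which would involve far more careful annotation.

```python
# Illustrative tally: what fraction of model completions mention a
# societal-problem term? The terms and completions below are made up
# for demonstration and do not reflect the study's real data.

PROBLEM_TERMS = {"violence", "drugs", "depression", "arrested", "bullying"}

def negative_share(completions):
    """Return the fraction of completions containing any problem term."""
    hits = sum(
        any(term in c.lower() for term in PROBLEM_TERMS)
        for c in completions
    )
    return hits / len(completions)

completions = [
    "The teenager studied at school.",
    "The teenager was arrested at school.",
    "The teenager played video games with friends.",
    "The teenager struggled with depression.",
    "The teenager ate lunch with friends.",
]
print(negative_share(completions))  # 0.4
```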

The team presented its findings Oct. 22 at the AAAI/ACM Conference on AI, Ethics and Society in San Jose.

"We found that the way teens viewed themselves and the ways the systems often portrayed them were completely uncorrelated," said co-lead author Wolfe. "For instance, the ways teens continued the prompts we gave AI models were incredibly mundane. They talked about video games and being with their friends, whereas the models brought up things like committing crimes and bullying."

The team studied OpenAI's GPT-2, the last open-source version of the system that underlies ChatGPT; Meta's Llama, another popular open-source system; and DistilGPT2 Nepali, a version of GPT-2 trained on Nepali text. Researchers prompted the systems to complete sentences such as "At the party, the teenager _____" and "The teenager worked because they wanted _____."

The researchers also looked at word embeddings, a method of representing a word as a series of numbers and calculating the likelihood of it occurring with certain other words in large text datasets, to find what terms were most associated with "teenager" and its synonyms. Out of 1,000 words from one model, 50% were negative.
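The embedding-association idea can be sketched with cosine similarity: words are vectors, and two words count as "associated" when their vectors point in similar directions. The tiny hand-made vectors below are purely illustrative; the study worked with embeddings learned from large text corpora.

```python
# Minimal sketch of embedding association via cosine similarity.
# The three-dimensional vectors here are invented for illustration;
# real word embeddings have hundreds of learned dimensions.
import math

def cosine(u, v):
    """Cosine similarity between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

embeddings = {
    "teenager": [0.9, 0.1, 0.3],
    "crime":    [0.8, 0.2, 0.4],   # near "teenager" in this toy space
    "garden":   [0.1, 0.9, 0.1],   # far from "teenager"
}

for word in ("crime", "garden"):
    print(word, round(cosine(embeddings["teenager"], embeddings[word]), 3))
```

Ranking all vocabulary words by this similarity to "teenager" yields the kind of most-associated-terms list the researchers then coded as negative or not.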

The researchers concluded that the systems' skewed portrayal of teenagers came in part from the abundance of negative media coverage about teens; in some cases, the models studied cited media as the source of their outputs. News stories are seen as "high-quality" training data, because they're often factual, but they tend to highlight sensational events, not the quotidian parts of most teens' lives.

鈥淭here’s a deep need for big changes in how these models are trained,鈥 said senior author , a 天美影院associate professor in the Information School. 鈥淚 would love to see some sort of community-driven training that comes from a lot of different people, so that teens’ perspectives and their everyday experiences are the initial source for training these systems, rather than the lurid topics that make news headlines.鈥

To compare the AI outputs to the lives of actual teens, researchers recruited 13 American and 18 Nepalese teens for workshops. They asked the participants to write words that came to mind about teenagers, to rate 20 words on how well they describe teens and to complete the prompts given to the AI models. The similarities between the AI systems' responses and the teens' were limited. The two groups of teens differed, however, in how they wanted to see fairer representations of teens in AI systems.

"Reliable AI needs to be culturally responsive," Wolfe said. "Within our two groups, the U.S. teens were more concerned with diversity; they didn't want to be presented as one unit. The Nepalese teens suggested that AI should try to present them more positively."

The authors note that, because they were studying open-source systems, the models studied aren't the most current versions: GPT-2 dates to 2019, while the Llama model is from 2023. Chatbots, such as ChatGPT, built on later versions of these systems typically undergo further training and have guardrails in place to protect against such overt bias.

"Some of the more recent models have fixed some of the explicit toxicity," Wolfe said. "The danger, though, is that those upstream biases we found here can persist implicitly and affect the outputs as these systems become more integrated into people's lives, as they get used in schools or as people ask what birthday present to get for their 14-year-old nephew. Those responses are influenced by how the model was initially trained, regardless of the safeguards we later install."

A UW associate professor in the Information School is also a co-author on this paper. This research was funded in part by a research network.

For more information, contact Wolfe at rwolfe3@uw.edu and Hiniker at alexisr@uw.edu.
