MRClay: Probing XR+AI Tools in Embodied Craft Learning and Teaching Dynamics
Design Technology Research
COURSE NAME
Graduate Independent Study for School of Design, CMU
The rise of media technology and maker culture has transformed how people engage in creative practices, especially in craft education. Remote learning has made craft disciplines, like pottery, more accessible, but traditional hands-on learning still poses challenges for technological solutions. Existing attempts to teach craft through extended reality (XR) often overlook the embodied and tactile nature of craft learning. In response, we introduce an AI-augmented Mixed Reality (MR) ceramic guiding system focused on wheel throwing, a foundational skill in ceramics. Our system combines MR’s immersive interaction with AI-driven, real-time feedback, making tacit knowledge more accessible to learners. Through co-design with ceramic instructors and learners, the system offers personalized, step-by-step guidance and adaptive feedback, supporting various skill levels. This research aims to explore how AI-augmented MR can enhance the embodied teaching-learning process in craft education and its broader implications for the field.
COLLABORATORS
Steve Hu
PROJECT YEAR
2023 - Present
The challenge
Bias brings users convenience through accurate recommendations while potentially harming them in other ways. How do harmful biases in TikTok's algorithmic systems affect the user experience?
Problem Statement - Usability Test on TikTok's Ads
We tested the usability of current TikTok functionalities to identify the bias issues users encounter.
1. Racial Bias: TikTok recommended posts about users' own race only to experienced participants, while recommending posts about other races to the naive participants. Some participants swiped away posts from other races because they felt those posts didn't fit their lifestyle.

2. Gender Bias: Male participants received advertisements about sports shoes, games, and sexual content, while female participants received advertisements about cosmetics and shopping malls.

3. Brand Bias: TikTok recommended only certain brands of products to participants; in our test these included electronic devices and food.

4. Need Bias: Participants kept receiving ads for products they didn't want or need to buy. Even after they pressed "Not Interested" on these items, TikTok continued to recommend them.

5. Negative Action toward Biased Ads: Most participants stayed only briefly on biased advertisements and swiped them away. A few tried to mark the first few biased advertisements they saw as "Not Interested" but gave up after a while.
RESEARCH GOALS
How might we support users by personalizing their advertisement settings while facing potential algorithm bias?
Qualitative Analysis
We conducted 3 rounds of interviews with 3 participants: a think-aloud protocol, a pilot testing interview, and a semi-structured interview using a direct-storytelling method.

We collected interpretation notes from those interviews, analyzed them through an affinity diagram and a user journey map, and generated higher-level insights. Here are a few questions we paid the most attention to:
"How do you feel about the advertisement in TikTok regarding algorithm bias?"
"How do you feel about the current advertisement setting user flow? You can share both the positive and negative views with us."
"Are there more functionalities about advertisement you wish TikTok to add in the future? Why? How do you think this can contribute to deal with the algorithm bias and make the algorithm suit your needs?"
"Is there any concern for the possible functionality you just mentioned?"
"What improvement would you like us to do to the advertisement setting interface/user flow?"
Data analysis and synthesis - Interpretation Session + Affinity Diagram
Through interpretation sessions, we analyzed users' needs, motivations, and behaviors, then grouped and labeled them in an affinity diagram, synthesizing the insights from a first-person perspective. Users described their preferences, activities, and discoveries on TikTok, and shared their needs, suggestions, complaints, and concerns.
affinity diagram
Models - User Journey Map
We built a user journey map to help us better summarize and understand users' stances from the interviews. This bridges users' opinions and the design implications explored in the following speed dating sessions.
Quantitative Analysis
We also conducted a Google Forms survey with 13 questions in 3 categories (app features, advertisements, and purchase behaviors) to verify our preliminary insights from the qualitative research. We received 32 responses, which helped us iterate on our insights and the following speed dating session.
Most of our participants are between 22 and 25 years old and use TikTok for less than an hour every day.
The survey confirms our interview findings about the current issues with ad settings and people's strong need for a tutorial. The ad settings should be easier to find, and the interactive hints are popular among respondents.
The survey verifies that people encounter biased ads on TikTok, sometimes without realizing it, including buying products from ads (good bias) and receiving inappropriate or fake ads (harmful bias).
The survey reveals respondents' interest in recommended ads when they use TikTok to purchase items.
Insights
1. Good and bad bias: Users enjoy advertisements with good bias and acknowledge bias-related issues, even if they sometimes don't notice them.

2. Desire to mitigate bias: Personalizing interested categories reflects users' desire to mitigate bias by controlling the types of advertisements they encounter.

3. Diverse ad preference settings: Users welcome diverse advertisement preference-setting mechanisms but prefer simple and intuitive ones.

4. Need for improved navigation: Deeply nested ad-related operations and settings create a cumbersome personalization experience, leading to a need for improved navigation.

5. Seamless function integration: A seamless TikTok experience requires integrating personalization features with existing functionalities.
Low-Fidelity Prototype
Speed Dating
We used speed dating on our insights to help us explore possible futures, validate needs and identify risk factors.
The first concept resonated with most participants.
speed dating 01
- Most participants express interest in real-time feedback from TikTok.
- Users prefer simple and intuitive functionalities.
- Users prefer simple interactions on their TikTok “for you” page.
speed dating 02
- We expected users to consider adding a plugin to eliminate ad biases, but most regarded it as too complex and unnecessary.
speed dating 03
- Although the idea of sharing preferences surprised users, most felt uneasy about it.
- It raised concerns about privacy and their daily social interaction patterns.
Low-Fidelity Prototyping
Building on the initial speed dating concept, we conducted a contextual prototyping session using our lo-fi prototype with 7 participants. During these sessions, we identified three critical moments in their user experience: the moment they open TikTok, when they scroll consecutively, and when they watch an ad to the end multiple times. To further explore these moments, we created a physical overlay for our interactive hints and tested them in real-life environments where people typically use TikTok, directly on participants’ phones.
physical prototype
scenario 01
scenario 02
scenario 03
Evaluation
The overall feedback from the prototyping session was positive; our solution noticeably increased users' satisfaction. However, there were some flaws in the design of the physical prototyping process that affected workflow consistency. Considering the final delivery method, the positive feedback weighed more heavily in our final decision.
Final Design Prototype
Here we present our final design solutions for biased ad personalization. Since our solution takes the form of an extension, we reuse components and styles from the current TikTok UI and integrate the extension into the existing user flow.
Scenario 1: Tutorial
We’ve moved the deeply nested "Not Interested" button to the right-hand column, positioned next to the "Like" button. When users open TikTok for the first time, the extension will display an overlay tutorial explaining its functionality.
Scenario 2: Interactive Hints
As users continue scrolling through videos, interactive hints will appear. Depending on their choice, the extension will respond differently by either adjusting recommendations (showing fewer or more of similar content) or navigating to the report page.
Scenario 3: Personalization
If users continue watching an ad without taking any action several times, the extension will prompt them to specify their preferences and guide them to the ad personalization page. Here, users can customize their preferences by selecting personalized tags, giving them more control over the ads they see.
Problem Statement
OUTLIER PROBLEM
- Large datasets with unexpected outliers: hard to find, but they may have a huge effect on prediction results.
- Webcam intruders during the collection process: consecutive capture makes removal difficult.
- Human-introduced diversity: humans can't precisely control the level of outlier insertion, which may impede prediction accuracy.

TEACHABLE MACHINE DESIGN ARGUMENT
- Teachable Machine takes all input as training data but has no ability to recognize outliers in the dataset.
- Prediction details are hidden too deep for users without an ML background to understand them easily.

DESIGN IMPLICATION
- Users can recognize outliers if we introduce a human-in-the-loop methodology.
- Simple prediction details with proper explanation can help users discover a potential outlier's influence.
RESEARCH GOALS
How might we give feedback that helps users learn how outliers affect the accuracy of the trained model, so that they can provide higher-quality training samples?
- (RQ1): How could the interface alert people that they might have accidentally introduced outliers into the training dataset?
- (RQ2): How could the interface guide users to effectively filter the outlier suggestions from the Teachable Machine algorithm?
PROTOTYPE DESIGN
Initial Solution 01: Human supervised classification
SOLUTION 1
Initial Solution 02: Passive outlier identification and correction
SOLUTION 2
Final solution: "Alert" Interface and "Outlier filter" Interface.
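To make the "outlier filter" idea concrete, here is a minimal, purely illustrative sketch of how suggested outliers could be ranked for the checklist. It assumes each training image has already been embedded as a feature vector (Teachable Machine itself builds on pretrained image embeddings); the function name and threshold are hypothetical, not the interface's actual algorithm.

```python
# Illustrative sketch only (not the study's exact mechanism): rank suggested
# outliers for ONE class by their distance from the class centroid, assuming
# each training image is already embedded as a feature vector.
import numpy as np

def suggest_outliers(embeddings: np.ndarray, z_threshold: float = 2.0) -> list[int]:
    """embeddings: (n_samples, n_features); returns indices to flag for human review."""
    centroid = embeddings.mean(axis=0)                     # class centroid in feature space
    dists = np.linalg.norm(embeddings - centroid, axis=1)  # distance of each sample
    z = (dists - dists.mean()) / (dists.std() + 1e-8)      # standardize distances
    return [i for i, score in enumerate(z) if score > z_threshold]
```

The flagged indices would only be suggestions; in the human-in-the-loop flow above, the participant makes the final keep-or-remove decision.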
PROTOTYPING SESSION
I tested the prototype with 5 participants: 1 from a CS background, 2 from a design background (architecture), and 2 from interdisciplinary backgrounds (HCI and cognitive science/management). Most of them had basic knowledge of machine learning or statistics.
Round 1: Tutorial (Control Group)
- Let participants run TM and observe how the current dataset performs when classifying the designated samples.
- Have participants choose their preferred alert system between two alternatives; the chosen design is used as the outlier filter interface prototype in the following rounds.

Round 2: Train a new model with a human-filtered training set
- Let participants optimize the training set manually by removing the suggested outliers in the checklist.
- Run TM again on the new dataset and observe how the accuracy changes, bearing the outlier pattern in mind.

Round 3: Apply the new model with a new training set
- Let participants manually filter a new dataset based on the outlier pattern learned in Round 2, run TM again, and observe how the accuracy changes.

After-session interview and questionnaire
paper prototype for prototyping
checklist for human-in-the-loop
Paper prototype for alert system interface
Outlier checklist
(Red: suggested, Cross: participants' decision)
Quantitative Analysis
The chart below visualizes the prediction accuracy on sample images across the 3 rounds for 5 participants. The suggested round (model-filtered) has the highest accuracy. Compared with the initial round, however, the accuracy of the manually filtered round also improved significantly, likely because participants learned the outlier pattern from the suggestion function.
accuracy result in 3 round prototyping
Qualitative Analysis
The survey and interview after the prototyping session gave me information beyond the prediction accuracy: participants expressed their findings, feelings, and suggestions about the process, making it possible to synthesize design implications.
Interface Design
For the alert interface, participants preferred the second design because:
        - They can see the accuracy from it and understand the intention of deleting outliers.
        - It encourages them to think about further optimization for better performance.

For the outlier filter interface, participants attributed its better performance to the simple and clear design.
Other Findings
Participants expressed stronger curiosity about Teachable Machine's mechanism than expected. While willing to learn more about it, they tried to keep a certain distance from the algorithm. The Round 3 results disappointed participants with professional backgrounds, even though they were confident about their judgment in Round 2.
Insights & Implications
The “alert” interface stimulates their desire to learn what affects their prediction performance.

The “outlier filter” interface provides a possible paradigm for users to learn about how to distinguish outliers and what effect they have on the model.
NEXT STEP
More Research Questions
Participants' feedback, especially their points of confusion, also raised some interesting questions for further speculation, including:

        - Why does Google put “under the hood” into the advanced functions?

        - How to balance human bias (users’ choice) and ML model bias (algorithm bias)?

        - How well could the filter system perform on unpredictable new datasets (generalization)?
Interface Iteration Suggestion
They also provided valuable suggestions for further interface iteration, including:
How to view the dataset:

        - Row layout (integrated into the current TM upload section)

        - Page view (added behind the upload section)

How to identify the marked outliers:

        - In-site marking

        - Outliers section for all groups

        - Outliers section for each group

How to backtrack:

        - Archive section (back up the best-performing filtered dataset)

The project
Piggyback Prototyping for testing the effectiveness/likeability of social bots in social platform connection
RESEARCH GOALS
How effectively does the social bot encourage private one-to-one connections via Direct Messages of processed information on Twitter (effectiveness), and how do people feel about this connection method (likeability)?
Q: What is one-to-one connection?
A: The ability to promote connections (following) with strangers and connections (chatting) between followers.
Strategy
The bot recommends similar friends/topics based on the similarity of recently liked tweets, in order to examine:
- How effectively does the follow-up bot message connect users?
- How do participants' attitudes toward this method affect their behavior?
PROTOTYPING SESSION
I chose participants based on their geolocation on Twitter; most of them were living in Pittsburgh.
Recommendation System: Scraping from Twitter
As part of the recommendation bot, I wrote a Python scraping script to calculate the similarity between two users. To find potential recommended users, I extracted those who liked the same tweet as the participant and analyzed the similarity of their recently liked tweets. If their similarity surpassed the set threshold (85%), I used the first three hashtags related to a specific topic in the conversation as the content for the connection.
scraping algo code
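The screenshot above shows the original script; below is a minimal sketch of just the similarity step, assuming both users' recently liked tweets have already been scraped into lists of strings. It uses TF-IDF cosine similarity as one plausible measure; the variable names and the exact metric in my script may differ.

```python
# Minimal sketch of the similarity check between two users' recently liked
# tweets, assuming the tweet texts are already scraped into Python lists.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

SIMILARITY_THRESHOLD = 0.85  # the 85% threshold mentioned above

def likes_similarity(participant_likes: list[str], candidate_likes: list[str]) -> float:
    """Treat each user's liked tweets as one document and compare the two documents."""
    docs = [" ".join(participant_likes), " ".join(candidate_likes)]
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(docs)
    return float(cosine_similarity(tfidf[0], tfidf[1])[0, 0])

def should_recommend(participant_likes: list[str], candidate_likes: list[str]) -> bool:
    """Only recommend the candidate if the similarity clears the threshold."""
    return likes_similarity(participant_likes, candidate_likes) >= SIMILARITY_THRESHOLD
```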
Recommendation System: Conversation Tree
I built a conversation tree to organize how prompts are sent to participants, based on the purpose of the current prototyping session and their responses.
conversational tree
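For illustration, the tree can be thought of as a small lookup structure like the sketch below; the node names, prompts, and reply categories here are hypothetical stand-ins for the actual tree shown above.

```python
# Hypothetical encoding of the conversation tree: each node stores the prompt
# to send and which node each category of reply leads to.
CONVERSATION_TREE = {
    "intro": {
        "prompt": "Hi! Based on tweets you liked, you may enjoy these topics: {hashtags}",
        "next": {"positive_reply": "follow_up", "negative_reply": "end"},
    },
    "follow_up": {
        "prompt": "Would you like to see accounts tweeting about {hashtags}?",
        "next": {"positive_reply": "questionnaire", "negative_reply": "end"},
    },
    "questionnaire": {
        "prompt": "Thanks! Could you help this project by filling out a short questionnaire? {link}",
        "next": {},
    },
    "end": {"prompt": None, "next": {}},
}

def next_node(current: str, reply_category: str) -> str:
    """Walk the tree; unknown or missing replies fall back to ending the thread."""
    return CONVERSATION_TREE[current]["next"].get(reply_category, "end")
```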
Recommendation System: Human bot
After finishing the above preparation, I created an "official" Twitter account and began sending follow-up prompts manually based on the responses (I didn't have the automation skills at that time, lol). If a participant was willing to assist with this project, they received a questionnaire.
twitter account
prototype effect
Prototyping Session: piggyback prototyping and persona interview
The prototyping session comprised two parts, each with 50 participants. Piggyback Iteration II excluded users who had already followed the participants. Because the results didn't yield enough feedback for analysis, I also conducted persona interviews.
Quantitative Analysis
Judging by the percentage of users who followed the prompt, the effectiveness is not encouraging. I tested on 100 random participants, but the critical mass for this prototyping is still unclear.

While not increasing the rate of being blocked (i.e., without reducing likeability), the follow-up question implicitly led participants to check and follow the instructions.
Response Rate
Qualitative Analysis
From the interviews, participants identified the following factors affecting their willingness to respond to the bot:

- Privacy concerns: whether the bot is certified affects users' trust in its messages. Many potential participants close DMs to strangers or state in their profile that they do not accept DMs.

- Human-likeness: a human-like profile and responses gain more empathy from users.

- Algorithm transparency: an opaque algorithm can frighten those who receive a message out of nowhere.
Insights & Implications
Direct Messages are an annoying way to send bot messages, which may explain why only some people responded to the bot.

The critical mass for this prototyping is still unclear; more than 100 samples may be needed to collect adequate feedback for analysis.

The follow-up question positively raises participants' attention and curiosity. However, judging by the percentage of users who followed the prompt, the effectiveness is not encouraging.
NEXT STEP
This prototyping exercise was hardly a successful attempt, but it gave me many insights for the next step:
Connection bot design:
- With better programming skills, I could automate the whole process to support a larger piggyback size.
- There is no mechanism encouraging participants to give feedback about the conversation (whether they'd like to receive the message and how they feel about it). Meanwhile, the conversation tree should also implicitly encourage participants to respond to the bot's prompts.

Prototyping algorithm:
- The similarity measure could also include profiles, hashtags, new tweets, and replies; a more comprehensive algorithm may improve the recommendation accuracy and lead to more positive feedback.
Background
The development of media technology and the rise of maker culture have significantly transformed how people engage in creative art practice (Song, 2022). On the one hand, the education process has shifted from traditional in-person courses to various forms of remote learning, a trend that was further accelerated by the pandemic. On the other hand, the increasing accessibility of art-oriented crafts has blurred the lines between professional and amateur practitioners, potentially reducing the depth of skill in craft practice (Song, 2022; Shiner, 2012). This shift has made remote education more feasible and accessible to a broader audience.

Craft, as a discipline of creative art, has experienced a similar transition. There have been numerous attempts to teach craft using different technologies, such as sensors, multimedia, and extended reality (XR), across various fields including pottery making, textiles, and origami. XR, in particular, has shown potential advantages for remote learning due to the immersive experience it offers and its ability to overlay rich information onto the real world. However, our comprehensive literature review and immersive sessions with a ceramic instructor reveal that previous XR applications in craft education have not fully addressed the embodied nature of craft learning. Specifically, these applications often overlook the interaction between the body (particularly the hands) and the real-world context of materials and tools, the transfer of tacit knowledge for complex tasks (Jasche et al., 2021), and the social engagement of traditional apprenticeship (Wood et al., 2009).

Given these gaps, we wonder:
(1) How could tacit knowledge be made visible and learnable using XR technologies?
(2) How could XR technologies help learners learn from "critical incidents" (Groth, 2022) in complex craft processes full of "workmanship of risk" (Dhiman et al., 2024), as they would in traditional in-person apprenticeships?

To address these challenges, we adopted the Mixed Reality (MR) + AI technical framework. MR enables users to interact with real-world materials using their bodies, thereby maintaining the identity and authenticity of craft practices within a digital framework. Additionally, MR synthesizes multimedia elements to make tacit knowledge more accessible and comprehensible. AI enhances this experience by providing context-aware, adaptive feedback that mimics the interactive nature of in-person education through a question-feedback loop.
RESEARCH QUESTION
How does the AI-augmented Mixed Reality (MR) Ceramic Guiding system affect the embodied teaching-learning process in ceramic crafts?
(R1) - How does the interplay between technology and the embodied nature of craft education influence the design of the system?
(R2) - How do ceramic artists of varying skill levels and roles perceive the system's effectiveness in supporting their learning and teaching objectives?
(R3) - What are the alternative effects of the system when used by artists?
Methodology: Research through Design with the system 
We use Research through Design (RtD) as the primary methodology of our research. While we envision the system as a pedagogical tool for real-world ceramic-making education in the near future, we also regard it as a design probe that generates knowledge through its design, implementation, and testing. While we proposed functions for the system as solutions to the questions identified in the literature, new problems emerged during our close engagement with the local ceramic community and hobbyists. These interactions shape our discussion of the system's potential impact. (needs literature) Our RtD process is also highly participatory and instructor-centered, involving the following steps:

(1) Immersive Ceramic Making: We began by immersing ourselves in the ceramic-making process using an auto-ethnographic approach, informed by the “expert learner” concept (Wood et al., 2009). This framework positions the researcher as a mediator between expert craft makers and technical developers/designers.

(2) Key Step Extraction: We identified key steps in traditional studio settings and created schematic diagrams illustrating both the learning process for students and the teaching approaches used by instructors in ceramic-making.

(3) Iterative Development: We translated the diagrams into the key features of our system. Through this process, we conducted pilot testing sessions with novice learners. We also consulted instructors for professional feedback, refining the system iteratively.

(4) Ethnographic Study in Ceramic Studios: Once a functional demo was ready, we returned to the ceramic studio for short-term ethnographic studies (Hughes et al., 1995). These observations gave us more insights and raised more questions based on our deepened understanding of the process during system development. We introduced the demo to instructors and learners to gather further suggestions.

(5) Prototyping Sessions with Practitioners: In the final phase, we conducted prototyping sessions with ceramic practitioners of different skill levels, including novices with no prior experience, experienced learners, and instructors. These sessions helped us investigate questions derived from both the literature and in-situ observations.
Ceramic Guiding System
The system’s functionality is derived from design goals, with specific features informed by findings from the auto-ethnographic study. In this section, we present the system setup and provide a detailed description of its functionality with a brief overview of its technical implementation.
Setup
The system's setup is simple and intuitive. We set up our ceramic corner with a VEVOR 9.8" LCD Touch Screen Clay Wheel (GCJX-008) and use a Logitech C920 webcam to capture the shape data. To ensure a clear distinction between the clay and the background, the corner is wrapped in black cloth, and the basin of the pottery wheel has been removed. Our tool is built on the Meta Quest 3, utilizing its passthrough mode to allow users to see their surroundings while wearing the device. The setup process is straightforward: users wear the headset, sit in front of the pottery wheel, select the appropriate speed and mode, and initialize the system using the controller. Under the system's guidance, they then practice making ceramics by hand.
Setup
Functionality
For Novice Learner: Step-by-step Learning

(1) Learning Flow: Watch, Imitate and Practice
Our system provides overlaid holographic gestures and real-time gesture recognition, and gives learners time to practice. From our auto-ethnographic study, we observed that in traditional ceramic classes, students observe and follow their instructors. Correct and precise gestures are critical in wheel-throwing due to the high precision required in ceramic making: a small mistake, such as a slight shake, can ruin the current progress. However, traditional teaching methods often struggle to provide real-time feedback during critical incidents. To recreate and enhance the learning flow of a traditional setting, our system adopts a similar procedure: users first watch a holographic animation to observe the correct gestures. They then imitate the correct starting gesture with real-time, text-based hand-part suggestions and practice what they have learned using the hologram.
The gesture holograms are created based on schematic diagrams, modeled in Blender, and imported into Unity as prefabs. For gesture recognition, we piggyback on the XRHand package, surfacing the output of its gesture recognition algorithm through text-based instructions. The language is simplified to make it easier to understand, ensuring accessibility for learners of varying skill levels.

(2) 3D learning: Hologram, Video and Tips
Ceramic learning is a three-dimensional task, requiring learners to shape a ceramic piece in real space. In traditional studio settings, observation alone is often insufficient to capture the nuanced details of hand movements and their interaction with the clay. Even with hands-on guidance, the learner's perspective is typically limited to a third-person view. Our system leverages Mixed Reality to create an immersive 3D spatial display, providing 1) first-person hologram overlays that display the relationship between hands and clay, allowing users to closely observe and understand the intricate details of the gesture; and 2) third-person demonstration videos and tips from experts that carry knowledge from the traditional studio while conveying tacit knowledge such as wetness, pressure, time, and speed. By combining these perspectives, the system enhances the learner's understanding of both the explicit and tacit aspects of ceramic making, bridging the gap between traditional and immersive learning experiences.

(3) Learn from feedback: Correction and Summary
For novice learners, it is crucial to receive feedback during their learning journey when they 1) encounter difficulties and 2) seek a professional evaluation of their skill and of how to improve their next attempt. Our system fulfills these needs by providing corrective guidance during "critical incidents," replicating the hands-on instructional experience of in-person learning. The system also generates ad-hoc summaries with suggestions on the final clay shape. To produce the feedback, we use the Python-based computer vision library OpenCV to capture the shape outline and identify critical positions using the Rhino.Python library. Our system evaluates progress in two steps: 1) comparing the current shape with the target shape, and 2) checking fundamental indicators of quality, such as symmetry. The spatial data is sent to the system via servers for analysis. Based on the evaluation criteria, the system selects the corresponding correction gesture. The spatial data is also processed in two ways: (1) a Python script calculates the similarity score between the current and target shapes, and (2) the OpenAI API generates text-based suggestions using a novice-level prompt.
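As a rough illustration of this evaluation step (a simplified stand-in for the actual OpenCV + Rhino.Python pipeline), the sketch below extracts the clay silhouette from a webcam frame, scores it against a target outline, and runs a crude symmetry check; the threshold values and function names are assumptions.

```python
# Simplified stand-in for the evaluation pipeline described above.
# Assumes the black-cloth backdrop from the Setup section; thresholds are illustrative.
import cv2
import numpy as np

def clay_outline(frame_bgr: np.ndarray) -> np.ndarray:
    """Extract the largest bright region (the clay against the dark backdrop) as a contour."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 60, 255, cv2.THRESH_BINARY)
    # OpenCV 4 return signature: (contours, hierarchy)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return max(contours, key=cv2.contourArea)

def shape_difference(current: np.ndarray, target: np.ndarray) -> float:
    """Compare two contours via Hu moments; lower means closer to the target shape."""
    return cv2.matchShapes(current, target, cv2.CONTOURS_MATCH_I1, 0.0)

def symmetry_error(contour: np.ndarray) -> float:
    """Crude left/right symmetry check around the contour's mean vertical axis."""
    xs = contour[:, 0, 0].astype(float)
    axis = xs.mean()
    left = axis - xs[xs < axis]
    right = xs[xs >= axis] - axis
    return abs(np.median(left) - np.median(right)) / max(axis, 1.0)
```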

For Advanced Practitioner: Guiding Assistant

(1) Improvisation
Experienced practitioners have already acquired the necessary skills to make ceramics. What the system aims to provide is the freedom for these users to work independently while still supporting them in refining their pieces. During our ceramic study journey and pilot studies, we observed a need among practitioners to refresh their skills. The advanced mode visualizes the current and target shapes in real time on the side panel for reference, so practitioners have the flexibility to make pottery at their own pace while periodically referring to the updates as needed. They can also refresh their wheel-throwing skills by revisiting gesture and shape holograms, reinforcing their techniques throughout their learning journey.

(2) Real-time Shape Guidance
For experienced practitioners, the goal shifts from simply creating a decent ceramic shape to achieving perfection, even in handmade work. Our system provides color-coded overlays to guide them in shaping the clay by comparing the current shape to the target shape during the process. The system uses three colors: red for "push inward," green for "correct shape," and blue for "pull outward."
To generate the color-coded overlay, we use OpenCV to capture the shape outlines and the Rhino library to analyze them, compare them with the target shape, and generate several profile slices at the intersection points between the two outlines. These slices are then swept to create spatial shape points. The spatial data is sent to the system via servers and visualized as an overlay on the anchor, providing real-time guidance to the practitioner.
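The color assignment itself can be illustrated with a minimal sketch (not the production Rhino pipeline): compare the wall radius of the current and target profiles at matching heights and map the difference to the three guidance colors described above. The tolerance value is an assumption.

```python
# Illustrative sketch: classify the wall radius at each sampled height against
# the target profile to drive the red/green/blue overlay described above.
import numpy as np

TOLERANCE_MM = 2.0  # assumed tolerance band; the real system tunes this value

def classify_profile(current_radius: np.ndarray, target_radius: np.ndarray) -> list[str]:
    """current_radius / target_radius: wall radius sampled at matching heights (mm)."""
    labels = []
    for cur, tgt in zip(current_radius, target_radius):
        if cur > tgt + TOLERANCE_MM:
            labels.append("red")    # too wide here: push inward
        elif cur < tgt - TOLERANCE_MM:
            labels.append("blue")   # too narrow here: pull outward
        else:
            labels.append("green")  # within tolerance: correct shape
    return labels
```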

(3) Real-time Multimodal Suggestion
For experienced practitioners, their advanced knowledge of skills and tools gives them higher expectations for more detailed and precise instructional assistance. Expanding on the summary in elementary mode, our system provides multimodal suggestions based on essential pottery parts, integrating text, audio, and gesture/shape holograms. This multimodal approach provides more detailed information, making it easier for experienced users to absorb.
To generate suggestions, the system passes spatial data to the OpenAI API to obtain text suggestions, uses text-to-speech technology to verbalize them, and selects corresponding gesture and correction prefab models from our asset library based on the comparison of the current and target shapes at the essential pottery parts.
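For the text branch of this pipeline, a minimal sketch of the OpenAI call is shown below; it assumes the shape comparison has already been condensed into a short summary string, and the model name and prompt wording are placeholders rather than the system's actual configuration.

```python
# Minimal sketch of the text-suggestion call, assuming the current-vs-target
# shape comparison has been condensed into a short summary string beforehand.
# Model name and prompt wording are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def pottery_suggestion(shape_summary: str, level: str = "advanced") -> str:
    prompt = (
        f"You are a ceramics instructor advising a {level}-level wheel-throwing student. "
        f"Comparison of the current piece with the target shape: {shape_summary}. "
        "Give one short, concrete suggestion for the next adjustment."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```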
Demo
User Study
Recruitment and Participants
As mentioned in the methodology section, our user study comprised two phases. In the first phase, we brought our system to traditional ceramic studios, conducted "quick and dirty" short-term ethnographic studies with instructors and students, and reflected with them on their practice and the system on site. In the second phase, we refined the system based on observations and feedback from the first phase. We then invited ceramic makers with varying skill levels to participate in on-site prototyping sessions. Following these sessions, we conducted semi-structured interviews and surveys to collect data for analysis. To recruit participants for the first phase, we contacted local ceramic studios and schools. For the second phase, we distributed recruitment posters across our university's facilities, screening participants into different skill levels: novice learners, experienced learners, and instructors. After each session, we also spread the study by word of mouth. To ensure diversity, we avoided recruiting from the same pool as our pilot study. Cold recruitment through posters across multiple facilities helped mitigate bias by attracting participants from different backgrounds and ensuring they had no prior exposure to the system. Ultimately, for Phase 1, we invited 2 instructors and 4 student participants, conducting two ethnographic study sessions; during each session, participants practiced the same skills three times. For Phase 2, we invited 21 participants, divided into two groups: 8 participants in the experienced group (2 instructors, 4 proficient practitioners, and 2 experienced practitioners), assigned IDs A-P01 to A-P08, and 13 participants in the novice group, assigned IDs E-P01 to E-P13.
Procedures
Ethnographic study and On-site reflection

(1) Consent: At the beginning of the study, we provided participants with a detailed explanation of the study’s purpose and procedures and obtained their consent. The study was approved by our local IRB under protocol # STUDY2024_00000325.

(2) Observation: The ethnographic study was conducted at a 1:2 ratio in ceramic studios, with one researcher observing two participants at a time. The researcher sat alongside the participants and documented their operations with minimal interference to their learning process. Each session lasted approximately one hour.

(3) Comparison and reflection: Following the session, both instructors and students were invited to watch a demo of the system and participate in an informal discussion. The discussion aimed to explore: (1) what they did during the session and how they reflected on it; (2) their opinions on the system's functionality; (3) how they would use the system in their practice; and (4) feedback on system limitations and suggestions for improvement. The complete interview questions can be found in Appendix X.

On-site Prototyping Session, Semi-Structured Interview and Survey

(1) Consent and Onboarding: At the beginning of the study, we explained the study to participants in detail and obtained their consent. Our local IRB approved the study under protocol # STUDY2024_00000325. After securing consent, we conducted an onboarding session of about 5 minutes to help participants familiarize themselves with the headset, the pottery wheel, and the material (air-dry clay). During this period, they were free to ask questions and seek clarification to ensure they were comfortable using the system before the study.

(2) Prototyping Session - Task: The prototyping sessions were conducted on a 1:1 basis on the CMU campus for both the experienced and novice groups. Each session gave participants 25–30 minutes to complete the same task: making a vase. Participants used different modes depending on their group: the novice group used elementary mode and followed the step-by-step tutorial, while the experienced group used advanced mode and worked independently toward the given target shape. During the process, they were free to explore the functions they found useful and navigate their making progress. Researchers encouraged participants to explore as many features as possible to provide comprehensive feedback but did not mandate their usage. The researcher sat nearby, observing participants' operations and providing assistance if needed. Figure X shows examples of ceramic pieces created during the sessions. Semi-structured Interview and Survey: After the session, we conducted a 20–25-minute semi-structured interview with each participant.

(3) The interview for the experienced group aimed to reflect on (1) their experience with the system; (2) their interaction dynamics with the system, including how they would use it and how it influenced their existing practices; and (3) feedback on system limitations and suggestions for improvement. For the novice group, the interview sought to gather insights into (1) participants' experience with the system and (2) potential usage scenarios. Additionally, participants completed a survey designed to complement the interviews by capturing attitudes and feedback quantitatively. Both the interview protocol and the survey were designed based on the design goals in the methodology section. The complete interview questions and survey can be found in Appendix X.
Data Collection and Analysis
For qualitative analysis, during the task sessions we took handwritten notes to document participants' activities and drew on those notes in our interviews. We recorded the semi-structured interviews and transcribed them using Otter.ai. Following Braun and Clarke's thematic analysis approach (Clarke & Braun, 2017), we then went through the transcripts, conducted interpretation sessions to extract relevant information, and created affinity diagrams to identify, refine, and iterate the themes with inductive coding during weekly meetings over two months. For quantitative analysis, we collected survey answers with Google Forms.
Results
We present our analysis results as five themes; the novice and experienced groups share common ground but also hold different opinions. Our findings reveal discrepancies in how skill levels influence perceptions of the system (Theme 1), uncover tensions between virtual and physical realities (Theme 2), reflect the impact of workflows on skill transfer (Theme 3), demonstrate how the system redefines roles and contexts in craft-making (Theme 4), and indicate the system's potential to support learning across skill levels while addressing limitations in creativity and improvisation in the craft process (Theme 5).

Theme 1: Common Process, Different Goals: Craft Skills Across Experience Levels

At the beginning of each interview, we asked both groups to reflect on which step they found easiest and most difficult. Among novice participants, many considered centering to be the easiest step and pulling up the most challenging (E-P01, E-P03, E-P04, E-P06). In contrast, all experienced participants held the opposite view. A-P01 explained this difference through the two groups' different goals: to a novice, mastering each step is more important, and without proper practice, performing harder steps such as pulling up is difficult; experienced individuals instead prioritize the final outcome, which is directly affected by centering. “So experienced are going to have a result oriented, novices are going to have a process oriented.” Despite their differing views on specific steps, participants agreed on several aspects of the learning process. All mentioned that having a big picture is important. For example, E-P09 reflected on a past mistake: “I didn't really understand the concept of how the air bubbles might affect me later. But at the time, it didn't, that didn't click very well." It is essential to learn by making progress rather than fixing, as A-P01 observed in her teaching: “They've been repairing the clay, breaking it and replacing it with another one.” Participants also agreed on the importance of practice, which was universally recognized as essential for developing familiarity with the material and process. As A-P02-Pilot shared: “I was just playing with some water in some clay but doing circles and understanding like how the wheel goes both ways and how like your movement affects”, E-P04 emphasized the importance of learning through failure: “Maybe everyone needs to experience how it goes wrong before they master it.” A-P01 highlighted how positive results from practice encourage beginners: “I think it's very rewarding for beginners to be able to fine-tune the steps and then slowly pull out a shape.”

Theme 2: Tensions Between Virtual and Physical Realities in Embodied Craft-Making

(1) Video vs Gesture: Real-world Resolution and Tacit Knowledge

The system provides both video and gesture-based guidance for learning each step. Participants generally felt that gestures effectively helped them understand what actions to perform and where to position their hands. For example, E-P04 found this very helpful: “What's most helpful was the initial gesture, like where the finger should be put on, which part of the ceramic.” However, participants noted that the gesture instructions lacked detail regarding the interaction between the hand and clay. Some were unsure about which part of the hand to use or how to apply pressure. E-P09 expressed his confusion: “It shows you where your hand should go, but it doesn't say what part of your hand is supposed to be touching the clay.” Some participants also felt the instructions for hand movement were too mechanical. A-P07 commented on the gesture recognition: “...but I have to read a lot of sentences. I don't know how to actually listen to the thing to make changes.” E-P06 felt the gesture animations were too “mathematical”: “It's (gesture) happening in a mathematical way. It's just going up in a very equal way and it doesn't seem like it's real.” In contrast, participants found videos more helpful because they provided richer details with real-world resolution and tacit knowledge. Videos allowed participants to observe hand gestures, clay shape changes, and their mutual effects more intuitively. E-P08 mentioned her experience when referring to the video: “I saw the video and I tried to make the same thing, and I saw mine, I know it's not perfect, but okay, I think I can move on.” Through video, participants could also infer additional details such as the amount of water required, the appropriate speed, and the pressure needed. E-P02 noted: “I can see movement and also the texture of the clay. Sometimes I can judge if I need more water or not.” For some experienced practitioners, video recalled their expertise better than the gestures; A-P07 said: “The system has the hand over the clay, and it does have instructions, but when I watched the video, I could kind of see her, feeling the clay and, shaping it, so that was helpful.”

(2) View and touch in 3D learning: View Angle and Overlay Placement

Participants felt the combination of gesture/shape holograms, videos, and tips helped them observe the instruction from different view angles, giving them a better understanding of its spatial shape. E-P04 appreciated this feature, stating: “Just as I feel like it did it well by seeing how it's three dimensional. It’s hollow inside. So it's kind of you get what it is like in a brief structural way.” Besides, the multisource triangulation helped participants avoid cognitive errors when following the gestures. E-P02-Pilot reflected on her practice: “because I was trying to follow the video and video is a mirror image, I think I did it the wrong way because I didn't pay enough attention to the hand hologram” Regarding the location of the hologram, participants shared mixed feedback. Some felt the shape hologram could help them identify the goal and use it as a reference for their current progress. A-P01 said: “... you're able to put the shapes in here (the pottery wheel by the system) so that I can see and compare them, and I think that that is very helpful in terms of practicing..” Others expressed concern. On the one hand, the overlay may obstruct the view, affecting the precision of wheel-throwing. On the other hand, novices felt the urge to practice on the hologram, sometimes destroying their progress with incomplete skills. E-P09 explained: “Because once you do it on clay, it is very hard to correct on clay… As soon as my hands were in place, I was already touching the clay.”

(3) Body extension limitation: Navigating Virtual and Physical Information

The system changed the layers of reality during embodied craft practice. While the headset extends the body’s ability to perceive virtual information, it also introduces a conflict between virtual instructions and the physical demands of wheel-throwing. Participants felt they needed to constantly move their bodies to receive the spatially organized information, which sometimes harmed the precision required for wheel-throwing. A-P07 highlighted this challenge: “So if I came too close to my part, I cannot see instructions. So I might just stay a physical distance away from my pottery to make sure I can read all of it. But that means I cannot look very closely at the pottery, the details.” E-P04 shared a similar concern: “I have to move a lot actively in order to see everything, which is also another layer of inconvenience.” The elimination of real instructors also adds a new layer of communication complexity. Participants felt that interacting with the system disrupted the natural hierarchy of information exchange, leading to information overload and a sense of disorientation. E-P02 expressed frustration with the abundance of simultaneous instructions: “There's a lot of things that just go automatically. There's instruction verbally, and then also with sound, with text. And then there's a gesture. And then there's a countdown.”

Theme 3: System Workflow’s Impact on Embodied Skill Transfer

(1) Immersion and Streamlining: Benefits and Limitations

Participants acknowledged that the workflow in the system is immersive and restores real-world experiences to some extent. E-P01-Pilot appreciated how the system engaged learners: “I feel like it will stick more to the learners compared to a person telling you what you need to do… your voice command is not conversational in a sense, but kind of helps me to orient myself in the learning process.” Similarly, E-P08 thought of the system like a video game, expressing enthusiasm for it: “I play video games. I was very excited to use something like that in video games. It was very nice to use it because I felt like I was playing a video game.” A-P01 noted: “It can't completely replace reality, but it basically reproduces what's in the scene.” Despite its immersive qualities, the workflow introduced certain challenges. The streamlined workflow made participants feel compelled to continue even if things went wrong. E-P04 expressed her unease: “The step seems to be very streamlined, but if there is any chance there is an unexpected something there kind of freaked me out.” Participants also struggled with the lack of a big picture. While the system provides text prompts for step goals and exit conditions, it does not adequately connect these with an overarching view of the process. E-P09 explained a mistake he made due to this limitation: “it said the base should be four to five millimeters thick. so I was aiming for the base of the wall, and it hit me later when I realized my thumb's going really far down here.” For the same reason, participants debated the timing for using specific functions and transitioning between steps. A-P08 gave some suggestions: “I think checking is good if it reassures that I am ready to move to the next step. But I also wonder if it should tell me that I'm ready to move to the next step without me having to express that”

(2) Autonomy: Empowerment and Challenges in Knowledge Acquisition

Many participants appreciated the autonomy offered by the system, which stood out compared to traditional education and other learning mediums. This autonomy allowed users to customize their progress while engaged in physical activities. E-P09 expressed his preference for the system: “Even if I had like a video playing, showing me how to do it, this (system) is much better than that because if I'm in the middle of doing it, I can't just pause the video and restart or look for Google tips on what I've done wrong very easily. So that helped a lot.” A-P01-Pilot held the same opinion: “I like how much autonomy you give the user. Like you can skip certain parts. You can decide. You can basically override the system.” Participants also felt autonomy in the freedom of trial and error without the pressure of instructors. E-P06 said: “The good thing is you can keep asking again and again, and not have to worry about teacher fatigue or patience. You just keep asking for the same instruction so that's very beneficial” However, some participants also pointed out the downsides of autonomy. The system’s flexibility might lead users to skip critical steps, resulting in gaps in understanding essential knowledge. A-P02-Pilot indicated: “...because you cannot see anyone else, so if you cannot understand that, you cannot even go past that, and even if you just skip it, still you're missing the technique, the theory behind it, so you cannot actually learn it.”

Theme 4: Shifting Roles and Contexts Through System Intervention

(1) Usage Scenarios: When and How to Use the System

Participants viewed the system as a mediator “in between videos and in-person,” as E-P02 described. They agreed this system could be used as a bridge between instructors and learners. A-P02-Pilot emphasized this potential: “It could be cool to bounce back with a real professor because if a professor can give you some input but you're not that good yet you can try and improve that particular skill with the tool.” Participants believe they can use the system to learn different skills, as E-P02 suggested: “Maybe I'll be able to follow different tutorials to make different shapes. That would be kind of nice.” They can get basic guidance; A-P07 noted its usefulness for trying new techniques: “If I'm going to learn a new technique, I might use it and then try it one or twice.” and E-P07 stated: “In the early stages, instead of you alone practicing, you can practice with the AI.” They can also practice before seeing instructors; E-P07 highlighted its value for early practice: “...it can be complemented with ceramic classes if I ever take them, so I can use once a week teacher and twice a week this, because it's no use seeing the teacher again and again if you don't practice.” Participants saw potential for the system to shift the paradigm of craft education by offering personalized and scalable learning. A-P08 observed: “Instead of the human having to teach everything, it's more personalized and nuanced, more contextual of what the student is lacking.” They envisioned batch education, where the system helps establish fundamental skills for all learners, freeing instructors to provide personalized assistance only when necessary. A-P08 explained: “Maybe if it's in a class setting, everyone uses the VR headset at the same time to establish the same understanding. And then for parts that they're struggling with, the human can come in.” Participants also imagined the system being used in diverse contexts beyond education, such as hobby, production, and dating. A-P01 described the system as ideal for amateurs aiming to learn professionally: “I think it's a hobbyist's aid, it's more for the amateurs who want to learn something, to be able to learn it relatively professionally, and learn it to a very good status.” While recognizing that production requires specialization, participants saw potential for the system to assist in specific steps. A-P01 noted: “a production aid needs to be very specialized, it's a very complicated thing to make, and it (the system) may be an aid to some of the steps.” E-P02-Pilot mentioned its popularity as a creative activity for socializing: “It's a common It's quite a popular dating idea”

(2) Impact on Usage Scenarios: Comparing System with Instructors

The system's functionality shapes participants' attitudes and assumptions about its intended usage scenarios, particularly when compared to its alternative: real instructors. We therefore guided them to reflect on the functions and make comparisons. Participants identified several benefits of learning from real instructors: (1) Real instructors can provide physical guidance and correction, as E-P04 stated: ”The teacher will be able to directly do it for me, correct it so that I can use a relatively perfect shape. I can have it done before I move on to the next step, but here I have to go with whatever I have.” (2) Real instructors can provide emotional support; E-P02 expressed her wish for the system: “I wish it was asking me ’Are you ready?’ And I'd be like, ‘oh yeah, I'm ready.’ to be more interactive.” (3) Real instructors can provide customized tacit knowledge, as E-P02-Pilot put it: “... a person can understand how much knowledge you have, and also where you are in the process from the start to the beginning.” E-P09 shared: “They would have had experience like, I remember I screwed this up before. And this is how I didn't do that anymore”, and added: “They could see if it's too wet or not wet enough, or you need to slow it down and that helped a lot.” (4) Real instructors are more mobile and sensitive. E-P05 mentioned the ability to ask specific questions, such as about wheel speed: ”you could ask about the speed of the wheel for real-time, get real-time feedback, asking to check, and then getting the feedback.” E-P03-Pilot mentioned instructors' ability to intervene preemptively: “an in-person teacher can intervene before the user even knows to ask for instruction.” E-P03 reflected on the system’s limitations in clarifying doubts: “It gives you certain instructions, but it can't really elaborate on certain things, or you can't ask it any questions”. Participants also acknowledged the system’s unique benefits: (1) providing a knowledge repository of different practices and contexts, as A-P01 suggested: “I think we can give them more choices, for example, for beginners, which is the easiest way to pull higher and more stable without destroying the center.” (2) working as a recording tool to provide remote asynchronous instruction, as A-P01 proposed: “If you have a problem, the teacher can upload the video to the system when he/she is demonstrating”

Theme 5: Improvisation and Creativity: Constraints from craft nature and skill levels

Improvisation and creativity, while closely related, have a nuanced and sometimes conflicting relationship in the context of craft. During the study, most participants felt they were following instructions, leaving no room for creativity. However, the experienced group provided deeper insights. The nature of wheel-throwing as a craft pursues perfection, which leaves little space for creativity; A-P02-Pilot said: “a bad thing because sometimes precision doesn't let you be creative” Its circular form also limits the creative possibilities. A-P01-Pilot gave an extensive explanation: “Because everything has to be circular. And you only have creativity in this one dimension.” In ceramic making, most creativity happens after this stage. A-P02-Pilot added: “People will do a very creative drawing on it, and they will do other decorations on the piece, or they will glaze it in a certain way.”

Skill level has a very interesting effect on improvisation. Beginners tend to follow the tutorials closely, but when their practice doesn't work as expected, or when they forget a step due to limited proficiency, they passively improvise to seek a solution. E-P08 described her improvisation: “Because I tried to cut it, but I think it was not so good, so I tried to shape it a little more.” As they become more skilled, they master the skills and tools to improvise, but the pursuit of a perfect shape and reliance on past experience confine improvisation. A-P01 reflected on this: “but if I’m an experienced person, and my goal is clear, make this shape, then I don't have this creative process, I only have the process of following the steps to finish it.” Similarly, A-P02-Pilot commented on the comfort of working within known techniques: “And I feel comfortable with the ones that I do know, rather than the ones that you're showing us.”
Discussion
The user study provided valuable insights into how both novice and experienced practitioners perceive the system and its effect on ceramic making. In this section, we will discuss our findings and explore the dynamics between people, technology and tools. Beyond the insights, we will also revisit the teaching/learning paradigm observed in ceramic studios and reflect on how the system differs from in-person apprenticeship and propose the implications for future iterations and improvements.
How does XR + AI affect explicit/tacit knowledge representation and acquisition in embodied craft making?
Our system tries to replicate and enhance what an instructor could do in teaching or guiding ceramic making. The process is twofold. First, we extract knowledge from instructors in a form the headset can effectively display. Second, we organize it, using the design goals in the methodology section, into the workflow for people to learn and follow. How does the design of these two steps affect people's making?

For knowledge extraction, Mixed Reality serves as a container for distilled, visualized knowledge. While wheel-throwing follows a highly standardized procedure due to its nature as a craft, there are diverse steps and techniques for achieving the same goals. Participants suggested that the system could function as a dataset covering a wide range of practices. However, this raises critical questions about managing information, particularly for novice learners who may already find the complexity of a 3D craft with high precision requirements overwhelming. Should the system prioritize building a standard of practice, since it is replicable? If a standardized approach is implemented, what happens when it fails to accommodate personal learning styles or objectives? Striking a balance between maintaining consistency and embracing variety will be a major challenge for the system's next iteration.
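To make the consistency-versus-variety question concrete, the sketch below shows one way such a repository might be organized: each step holds interchangeable techniques tagged by difficulty, and the system filters them by the learner's level. The step names, fields, and example techniques are illustrative assumptions, not the current system's data model.

```python
from dataclasses import dataclass, field

@dataclass
class Technique:
    name: str
    difficulty: str          # "beginner" | "intermediate" | "advanced" (assumed levels)
    instruction: str         # distilled text shown in the headset
    hologram_id: str         # reference to a pre-recorded gesture hologram

@dataclass
class Step:
    name: str                # e.g. "centering", "pulling"
    goal: str                # shared outcome every variant must reach
    techniques: list[Technique] = field(default_factory=list)

    def options_for(self, skill: str) -> list[Technique]:
        """Return only the variants appropriate for the learner's level,
        falling back to the full list if nothing matches."""
        matched = [t for t in self.techniques if t.difficulty == skill]
        return matched or self.techniques

# Example: one step, two interchangeable ways to reach the same goal.
pulling = Step(
    name="pulling",
    goal="raise an even wall without losing the center",
    techniques=[
        Technique("two-finger pinch pull", "beginner",
                  "Pinch gently and lift slowly over three passes.", "holo_pull_a"),
        Technique("knuckle pull", "advanced",
                  "Brace with the knuckle and lift in one continuous pass.", "holo_pull_b"),
    ],
)
print([t.name for t in pulling.options_for("beginner")])
```

A structure like this keeps a single shared goal per step (the "standard") while letting variety live at the technique level, which is one possible compromise between consistency and personal learning styles.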

In terms of knowledge acquisition, the system provides autonomy to both learners and instructors, enhancing individual learning as well as contextual teaching practices. However, autonomy in learning should be considered multidimensionally. Horizontally, as the system already achieves, autonomy enables self-paced learning. Vertically, as practitioners gain experience and develop their skills, the system should align with the mindset and tacit knowledge of experienced practitioners. For instance, can the system effectively help learners realize the critical importance of centering, a skill that involves tacit judgment and that our novice participants tended to overlook? For instructors, the system also provides an opportunity to cultivate empathy for novice learners by supporting their transition through distilled guidance. This bi-directional autonomy, empowering learners while fostering instructors' empathy, is where AI could process distilled knowledge and bridge these gaps effectively. Ultimately, the system could serve not only as a learning tool but also as a medium for enhancing pedagogical relationships.
What XR + AI can’t do - The discrepancy between virtual capability and reality needs
While building the system, we chose the essential information, workflow, and modality based on our own expertise and studio observation. However, to what extent can the system replicate the contextual learning experience enabled by XR + AI technology? For instance, findings reveal that many participants struggle with step cutoffs. Interestingly, our observations in the ceramic studio indicate that learners in real-world settings do not face the same challenges. This discrepancy raises questions about the unique limitations of the XR + AI system.

One possible explanation lies in the resolution of real-world interaction and the hierarchy of information noted in the findings. In the ceramic studio, instructors can dynamically adapt their teaching to address learners' difficulties in real time. While the system offers autonomy by allowing learners to replay steps, its pre-programmed animations and fixed step flow cannot replicate the natural granularity of in-person instruction. In addition, instructors provide feedback and suggestions as needed, creating a hierarchical learning experience. In contrast, the system exposes all features simultaneously and lets learners choose, potentially overwhelming them and disrupting the sense of progression found in human-centered teaching. Although these deficiencies could be alleviated through better design, fully replicating the flexibility and contextual sensitivity of in-person instruction remains a significant challenge.

Another limitation of the XR + AI system is its limited ability to communicate with users. Many participants expressed a desire to ask questions during sessions to reduce their confusion; what they sought was a bi-directional, conversational approach. While current AI technologies can support question-and-answer-based learning, our system's reliance on a passive, command-response model indicates a significant gap in its design. Addressing this limitation opens up several opportunities for improvement.

First, starting with the simplest modality, what is the most natural way to provide easy-to-understand text feedback? In traditional studio sessions, instructors often use metaphors to simplify complex hand movements and reduce learners’ cognitive burden. Could AI generate suitable and accurate metaphors, replicating the tacit expertise built over years of teaching?
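As a thought experiment rather than a feature of the current system, the sketch below shows how a metaphor-oriented prompt might wrap an existing correction before it reaches the learner. The prompt wording is an assumption, and `generate()` is a hypothetical stand-in for whatever LLM backend the system would use.

```python
# A minimal sketch of metaphor-based feedback, assuming a generic text-generation backend.
METAPHOR_PROMPT = (
    "You are guiding a first-time wheel-throwing student.\n"
    "Rewrite the following correction as one short, concrete metaphor a beginner "
    "can picture with their hands, and keep the physical direction accurate:\n"
    "Correction: {correction}"
)

def generate(prompt: str) -> str:
    # Placeholder for an LLM call (hypothetical); returns a canned example here.
    return "Cradle the clay like you are cupping water that you don't want to spill."

def metaphor_feedback(correction: str) -> str:
    return generate(METAPHOR_PROMPT.format(correction=correction))

print(metaphor_feedback("Apply even pressure with both hands while centering."))
```

Whether generated metaphors would stay physically accurate, the way an instructor's do, is exactly the open question this step raises.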

Second, how could the system leverage AI to generate multimodal instruction? While our current system integrates text, audio, and gesture holograms using spatial data and LLM-generated text, it struggles to synchronize these modalities. For instance, the system cannot yet assign gesture holograms based on the generated text. From our observations, most participants relied on interpreting text instructions, but the absence of a robust gesture database often led to mismatches. This challenge presents an opportunity to explore dynamic, integrated multimodal suggestions that improve the coherence and precision of instructions.
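One lightweight direction, sketched below under the assumption that each pre-recorded hologram carries keyword tags, is to score clips against the generated instruction and show nothing rather than a mismatched clip. The clip names and tags are invented for illustration; a fuller solution would likely use semantic matching rather than keyword overlap.

```python
# A rough sketch of pairing a generated instruction with a gesture hologram
# by keyword overlap. GESTURE_DB is a hypothetical, hand-tagged clip library.
GESTURE_DB = {
    "holo_center_press": {"center", "press", "palm", "steady"},
    "holo_open_thumb":   {"open", "thumb", "hole", "widen"},
    "holo_pull_lift":    {"pull", "lift", "wall", "raise"},
}

def match_hologram(instruction: str) -> str:
    words = set(instruction.lower().split())
    scores = {clip: len(tags & words) for clip, tags in GESTURE_DB.items()}
    best = max(scores, key=scores.get)
    # Fall back to showing no hologram rather than a mismatched one.
    return best if scores[best] > 0 else "none"

print(match_hologram("Press down with your palm to keep the clay steady."))
```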

Third, how can AI determine exit conditions? For in-person teaching, the decision to move to the next step often relies on implicit judgments by instructors. We envision the system’s AI enhancing users’ understanding of these transitions, but current shape-based judgments are inadequate for the flexible and unpredictable nature of wheel-throwing. Novices rarely produce perfect or even decent shapes, and their preferred shapes may not align with system-defined standards. How might the system adapt to this flexibility without rigid constraints? 
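A minimal sketch of what a more forgiving, shape-based exit check could look like is shown below, assuming the headset can sample the pot's radius at a few heights. The measurements and tolerance values are illustrative, not calibrated; widening the tolerance band is one crude way to respect "good enough" shapes that diverge from system-defined standards.

```python
# A sketch of an adjustable, shape-based exit condition (values in millimetres are made up).
def ready_to_advance(measured: list[float], target: list[float],
                     tolerance_mm: float = 5.0) -> bool:
    """True when every sampled radius is within the learner's tolerance band."""
    return all(abs(m - t) <= tolerance_mm for m, t in zip(measured, target))

target_profile = [60.0, 55.0, 50.0, 48.0]   # desired radii, bottom to top
novice_attempt = [63.0, 58.0, 47.0, 41.0]   # a wobblier but workable wall

print(ready_to_advance(novice_attempt, target_profile, tolerance_mm=5.0))   # False: strict standard
print(ready_to_advance(novice_attempt, target_profile, tolerance_mm=8.0))   # True: "good enough" for this learner
```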

Fourth, is shape alone sufficient to answer all the questions? Many tacit factors, such as water, pressure, and wheel speed, depend heavily on personal experience and are rarely made explicit during in-person teaching. Even if the system collected such data through an integrated sensor system and passed it to an AI agent, replicating real-world conditions while considering individual preferences and expertise would be a major challenge.
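If such sensing existed, one naive way to fold those tacit factors into the context an AI agent reasons over is sketched below. The sensor names, thresholds, and phrasing are assumptions, not measurements from our system; the hard part this sketch glosses over is that the "right" thresholds vary by clay, learner, and step.

```python
# A sketch of turning hypothetical sensor readings into a short natural-language
# context note for an AI agent. All thresholds are illustrative assumptions.
def describe_conditions(wheel_rpm: float, clay_moisture: float, hand_pressure: float) -> str:
    notes = []
    if wheel_rpm > 180:
        notes.append("wheel may be spinning too fast for this step")
    if clay_moisture < 0.25:
        notes.append("clay is drying out; add a little water")
    if hand_pressure > 0.8:
        notes.append("pressure is heavier than most instructors recommend")
    return "; ".join(notes) or "conditions look typical"

print(describe_conditions(wheel_rpm=200, clay_moisture=0.2, hand_pressure=0.5))
```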

Finally, how can AI provide emotional support to users? From our observations of in-person sessions, we identified a typical emotional support paradigm: (1) students expose vulnerability, (2) instructors provide encouragement after successes, and (3) students express confidence or happiness in their outcomes. These emotional exchanges occur naturally between both parties in the teaching-learning process. How can AI provide a similar emotional response, encouraging users to dedicate themselves to the learning process?

While XR + AI may overcome the limitations above in the near future, there are two intrinsic aspects of ceramic making that it cannot replace. First, ceramic making is an irreversible, long-term effort; novice learners need instructors to clean up the “mess” after a critical incident, and many participants expressed a need for physical intervention. Second, errors in ceramic making can be systemic and cumulative. While instructors can identify these broader patterns and address them through their ability to see the “big picture,” the system lacks this capability. Reaching the “aha” moment, when learners can recognize and resolve such problems on their own, requires consistent practice over time. These aspects indicate the indispensable role of human presence in ceramic learning.
Beyond XR + AI: Practitioner and Digital Supporting Tools
What makes digital support tools for craft practitioners difficult to build: analyzing from skill and goal
We documented and analyzed how practitioners with different skill levels interacted with our system. Here we draw on the power framework for creative support tools (Li et al., 2023) to analyze their interaction with the system through the concepts of “power-in” and “power-over.” In terms of “power-in,” both novice learners and experienced practitioners primarily followed the system's instructions and guidance to create ceramics. However, we also observed frequent “power-over” moments. Our findings reveal that users often engaged in unconscious or conscious improvisation, driven by their individual intentions and motivations. Experienced practitioners, in particular, tended to override the system's presented knowledge, breaking the soft constraints of the instructions to exercise their autonomy. This points to a limitation of the current system, which focuses mainly on providing instructional support for learning and guidance but overlooks the potential to encourage and support improvisation.

Our current system is designed to target both experienced and novice learners. However, during the study, two participants had a solid foundation in wheel-throwing but had been away from practice for a long period; they showed a strong tendency toward improvisation once they regained their muscle memory. We also identified one participant who lacked fully developed skills but had reached a stage where the need and impulse to improvise were increasing. These participants raise an important question: how can the system better adapt to users in this intermediate stage, respecting their intentions, motivations, and prior knowledge in a more flexible and responsive way?

Another critical issue to consider is how to leverage unconscious improvisation among novice learners to enhance their skills and experiences. Our findings indicate that most unconscious improvisation arises from the system's limitations, such as its inability to guide users effectively through critical moments or compensate for their lack of skill. This raises an intriguing question: if the system were sufficiently advanced, could it eliminate unconscious improvisation? At the same time, as the findings indicate, wheel-throwing admits diverse approaches. Could unconscious improvisation, in certain cases, give learners opportunities to explore and develop creative, alternative skills? If so, how can the system balance remedial, error-driven improvisation against improvisation that leads to meaningful learning and creativity?

Lastly, we can see the inherent tension between wheel-throwing as a skill and as a medium for creativity. Ceramic making has a dual character. On the one hand, it is a creative hobby for the general public. On the other hand, it is a craft essential for professional makers, often tied to production and livelihood. The two ceramic instructors in our study also sell handmade pieces in their studios. In this context, wheel-throwing, as a foundational step in production, often prioritizes precision over improvisation and creativity. However, the appeal of handmade pottery often comes from its creative and personalized qualities, which can emerge not only in post-throwing procedures but also during wheel-throwing itself. For example, one instructor created a wave pattern on the surface of a piece as an example of integrating creativity directly into wheel-throwing. How, then, can the system balance the demands of skill practice and creative expression? Participants offered valuable suggestions for future development, such as gamified challenges or optional advanced steps that encourage improvisation while preserving the integrity of craft practice.
Digital support tools’ new context: Collaboration in education, production and leisure activities 
From the findings, we can envision the future of such digital support tools. These tools could be used for remote and elementary craft education, assist with specific steps in handmade production, and even enhance leisure activities such as dating. Regardless of the context, the nature of interaction shifts from traditional human-human engagement to human-computer collaboration. We can see apprenticeship being transformed, with the system serving as a collaborative mediator between instructors and learners. For experienced practitioners, it also functions as a collaborator, providing guidance and creative possibilities. In more social contexts, such as dating, the system could offer additional emotional support for affective interactions. Through this transformation, the system presents an opportunity to serve as a collaborative tool in the near future.
BACKGROUND
The research explores LLM hallucination in its entanglement with the embodied medium. As an emerging concept, it has yet to be examined from a design-oriented perspective. Current algorithmic experience (AX) prototyping methods scratch the surface of the negative side of algorithms but do not elaborate on how to deal with “erroneous or unpredictable” results; they also confine the algorithm to virtual and tangible forms and explore it only from a human-centered design perspective. Thus, this research proposes to prototype LLM hallucination by expanding the definition to a broader embodied medium and using the AX prototyping method for speculation. To better convey the experience, the research focuses on two encounter scenarios: investigating the potential of multimodality as a conveying medium from a first-person angle, and integrating speculative design with a broader embodied medium beyond material from a third-person angle.
research gap
narrative
Territory Map of current research
Narrative Medium -
first-person and third-person
RESEARCH QUESTION
When LLM hallucinations are integrated into everyday embodied interaction experience:

(RQ1) - How might users recognize, interpret, and relate to LLM hallucination?

(RQ2) - How could designers leverage insights from the users to engage with hallucinations through embodied mediums and methodologies?
Hypothesis
Algorithmic Experience in Interaction Process
ax in process
Experience from algorithm logic/mechanism: This experience stems from how algorithms work. For prediction-based algorithms in machine learning, the input-output relationship is clear. In contrast, LLMs generate outputs through predicted token sequences, where the user's influence is more implicit because the output is natural language, creating a different kind of interaction.
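The toy sketch below is only meant to illustrate this contrast: a one-shot, prediction-style mapping versus an autoregressive loop in which each generated token conditions the next. Both "models" here are hard-coded stand-ins, not real learned models.

```python
# Contrast sketch: one-shot prediction vs. autoregressive generation (toy models).
def classify(features):                       # prediction-style: direct input-to-output mapping
    return "spam" if features["links"] > 3 else "not spam"

def next_token(tokens):                       # toy stand-in for an LLM's next-token sampler
    continuations = {"the": "kiln", "kiln": "is", "is": "warm", "warm": "."}
    return continuations.get(tokens[-1], ".")

def generate(prompt_tokens, max_len=6):       # autoregressive loop: output grows token by token
    tokens = list(prompt_tokens)
    while len(tokens) < max_len and tokens[-1] != ".":
        tokens.append(next_token(tokens))
    return " ".join(tokens)

print(classify({"links": 5}))                 # -> "spam"
print(generate(["the"]))                      # -> "the kiln is warm ."
```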

Experience from algorithm-generated content: This focuses on the type of content produced by algorithms. For LLMs, text, speech, images, or 3D models each provide distinct experiences, shaped by the modality in specific scenarios.

Experience from human interpretation of output: Users interpret algorithm outputs based on their socio-technical context, leading to varied reactions and experiences shaped by how they engage with the results.
Hallucination Experience Glossary 
LLM hallucination here synthesizes various technical issues that result in deviated responses. These issues stem from different causes and produce varied outcomes, shaped heavily by social context. I propose the following glossary as entry points into exploring the imaginative potential of LLM hallucinations:

Empathy – Emotional experience from a technical flaw: This perspective arises from how audiences interpret LLM hallucinations. While some express frustration, others view hallucinated responses as offering companionship, finding emotional support in erroneous yet benign feedback. The technical flaw becomes a personal, emotional experience.

Serendipity – Alternative experience resonating with social relationships: This reframing highlights hallucinations as responses that seem socially connected yet unexpected, fostering curiosity, empathy, and reflection. These moments of serendipity help users connect with broader social contexts in meaningful ways.

Alchemy – Creative experience from hallucinated content: In content generation, hallucinated responses, though factually inaccurate, can spark creativity. Users may find inspiration beyond their knowledge or expectations, turning hallucination into a catalyst for creative exploration.
Method
Prototyping Hallucination Experience
Prototype 01: Moodie Assistant
Key words: Empathy, Emotional projection, Interpretation ambiguity

Moodie Assistant depicts the algorithmic experience as an emotional response to hallucination. The prototype takes the form of a voice assistant with a gauge indicating the hallucination level. It is equipped with a series of remotes that let users and audiences express their emotional experience during the conversation, with different modalities and different levels of granularity in hallucination control. Different roles, users and audiences, have different emotional reactions when interacting with the device. The prototype's embodiment gives us a medium for discussing interpretation ambiguity.
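A minimal sketch of how the gauge and remotes could be wired is shown below, assuming the hallucination level is simply mapped to sampling temperature and to the chance of injecting an off-topic aside. The mapping and the injected text are illustrative choices, not the prototype's actual implementation.

```python
import random

def gauge_to_settings(gauge: float) -> dict:
    """gauge in [0, 1]: 0 = as grounded as possible, 1 = freely hallucinating."""
    gauge = max(0.0, min(1.0, gauge))
    return {
        "temperature": 0.2 + 1.1 * gauge,      # wilder sampling as the dial rises (assumed mapping)
        "inject_off_topic_p": 0.5 * gauge,     # chance of a deviated aside
    }

def maybe_inject(reply: str, settings: dict) -> str:
    if random.random() < settings["inject_off_topic_p"]:
        return reply + " ...though, unrelatedly, I keep thinking about tide pools."
    return reply

settings = gauge_to_settings(0.8)
print(settings)
print(maybe_inject("Your meeting is at 3pm.", settings))
```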
Prototype 02: Whisper Web
Key words: Serendipity, occasional encounter situated in social context

Whisper Web explores the hallucination experience as serendipity. The prototype takes the form of a chatbot with a personalized context “collection” that simulates an LLM “training set.” It examines how people react when a hallucination surfaces deviated context implying different social relationships. Instead of intervening directly, the prototype's embodiment leverages a visualization medium to observe and document how hallucinated information creates tension between humans and conversational agents.
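A rough sketch of the collection-as-training-set idea appears below, where a hallucination is modeled as quietly answering from a socially adjacent persona's collection instead of the user's own. The personas, notes, and drift probability are invented for illustration.

```python
import random

COLLECTIONS = {
    "me":        ["I run every Sunday with Ana.", "My sister calls on Thursdays."],
    "colleague": ["Standup moved to 9:30.", "Ana owes me a code review."],
}

def answer(question: str, persona: str = "me", drift_p: float = 0.3) -> str:
    # The question is ignored in this toy; a real chatbot would retrieve by relevance.
    source = persona
    if random.random() < drift_p:               # deviated, socially adjacent context
        source = random.choice([k for k in COLLECTIONS if k != persona])
    note = random.choice(COLLECTIONS[source])
    return f"(from {source}'s collection) {note}"

print(answer("What do I have with Ana this week?"))
```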
Prototype 03: Mindscape
Key words: Alchemy, turning hallucinated content into creative ideas, insights, and innovations

Mindscape explores the experience by seeking creative opportunities in LLM hallucination. The prototype is an XR application on an immersive-environments platform that lets users ideate and build an alternative world with a hallucination-prone LLM, focusing on the brainstorming workflow and ideation/iteration. The prototype aims to investigate the effect of hallucinated content on creativity. Its embodiment mitigates real-world constraints and pushes imagination as far as possible.
Speculative Film - Experience Narrative
User(pilot) study
Six participants were recruited through word of mouth for the pilot study. All participants are MSCD students or alumni with rich experience in design practice and familiarity with LLM-related applications and tools. They were paired into three groups, and three observation studies were conducted on three consecutive days. In the observation studies and the interviews that followed, we could listen to what participants “say” and watch what they “do.” After analyzing the collected data with affinity diagrams and interactive visualization, a workshop was conducted the following week for introspection and exploratory participatory design, in which participants proposed “solutions” from both the design-expert and user roles.
findings
(Note: Red: Recognition; Yellow: Interpretation; Blue: Relation)
Recognition
Interpretation
Relation
1
LLM Hallucination Characteristics
False positive from camouflage
Empathy from modal limitation
Doubt from aligning human expectation with model’s interpretation
Complex emotion response from ambiguity
Confusion from factual subtlety
2
Embodied medium's effect on LLM hallucination experience
Medium’s explainability on hallucination
The origin of hallucination
Empathy from engagement with mediums
Medium’s indication ability on hallucination
Interpretability of modality
Match between hallucination and medium nature
Learning burden of medium
Interplay between hallucination and prototyping technique
3
Interactive pattern of LLM hallucination experience
From irrelevant response content
Triggered by irrelevant hard-to-interpret response
When hallucination aligns with user intention, emotion, and social distance (inner norm)
From outlier response pattern
Triggered by resonance with alternative context
When respecting the fine line between error and hallucination (value judgment)
insights
1
Prototyping on the fine line of recognition and relation
2
Prototyping hallucination and medium’s nature
3
Prototyping for critical, impactful moments
4
Prototyping with minimal learning burden
Designers need to balance clear recognition of errors and hallucinations in order to evoke empathy and create deeper connections with the experience. This balance ensures that participants can relate to the hallucination without being distracted by obvious mistakes.
Is hallucination better suited to factual knowledge or to abstract concepts, and does the medium enhance its explanation or its engagement? Prototyping should align with these characteristics, not only to inform future designs but also to communicate more effectively with the audience.
While some hallucinated moments, like factual errors or striking out-of-context responses, are key to the experience, others are benign or deeply hidden and can be overlooked in prototyping. Instead of monitoring everything, prototypes should focus on critical moments to enhance users' perception.
When prototyping works as a probe to explore design implications and alternative possibilities, designers should use rapid, easy-to-understand methods to reduce both the objective burden of the prototyping medium and the subjective burden of complex or unclear speculative discourse.