AdsYou
User Research, User Experience Design
Course
05-610 User Research and Evaluation
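AdsYou is an ad personalization plugin for Douyin. It uses simple, intuitive interactive prompts to help users easily find and personalize their ad settings, promoting the content they prefer and reducing harmful bias.
Collaborators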
Qiaoqiao Ma, Penghua Zhou
Project Year
2023
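Design Challenge
Algorithmic bias brings convenience to users through accurate recommendations, but it can also harm them. How does harmful bias in Douyin's algorithmic system affect the user experience?
Problem Statement: Usability Testing of Douyin's Ad Delivery Feature
We tested the usability of Douyin's current features to uncover existing algorithmic bias issues.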
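1. Racial bias: Douyin recommended only videos about the user's own race to long-time users, but recommended videos about other races to new users. Some participants removed videos about other races because they felt the videos did not fit their lifestyle.
2. Gender bias: Male participants received ads for sneakers, games, and sexually suggestive content, while female participants received ads for cosmetics and shopping.
3. Brand bias: Douyin recommended products from only certain brands to participants, including electronics and food.
4. Need bias: Participants kept receiving ads for products they did not want or need. Even though they marked these items as “Not interested,” Douyin continued to recommend them.
5. Negative response to biased ads: Most participants lingered on biased ads only briefly before swiping away. A few tried tapping “Not interested” on the first few biased ads they saw, but soon gave up.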
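Research Goal
How might we empower users facing potential algorithmic bias through personalized ad settings?
Qualitative Analysis
We conducted three rounds of interviews with 3 participants, using user research methods such as think-aloud pilot-test interviews and semi-structured interviews based on directed storytelling.

We recorded the interviews and analyzed them with affinity diagrams and user journey maps to derive more general insights. The questions we focused on most were:
“What do you think of the biased ads on Douyin?”
“What do you think of the current user flow for ad settings? Share as many positive and negative opinions as you can.”
“Would you like Douyin to add more ad features in the future? Why? How do you think this would help address algorithmic bias and make the algorithm meet your needs?”
“Do you have any concerns about the potential features just mentioned?”
“What improvements would you like us to make to the ad settings interface or user flow?”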
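Data Synthesis: Note Interpretation + Affinity Diagram
By interpreting the interview notes, we analyzed users' needs, motivations, and behaviors, then grouped and labeled them into an affinity diagram, consolidating insights from a first-person perspective. Users described their preferences, activities, and discoveries on Douyin and shared their needs, suggestions, doubts, and concerns.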
affinity diagram
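Model: User Journey Map
We built a model, the user journey map, to help us summarize and understand users' positions and to provide design implications for speed dating.
Quantitative Analysis
We also ran a survey with 13 questions in 3 categories: app features, ads, and purchasing behavior. The survey results validated the preliminary conclusions from our qualitative research. We received 32 responses in total, which helped us iterate on our conclusions and on the subsequent speed dating.
Most respondents were between 22 and 25 years old and used Douyin for less than one hour a day.
The survey confirmed the problems with the current ad settings that we identified in the interviews, as well as a strong demand for an ad settings tutorial. Respondents wanted ad settings to be easier to find, and interactive prompts were popular among them.
The survey confirmed that people encounter biased ads on Douyin, even if they are sometimes unaware of it. This includes purchasing products from ads (beneficial bias) and receiving inappropriate or false ads (harmful bias).
The survey revealed respondents' interest in recommended ads.
Design Implications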
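1. Beneficial vs. harmful algorithmic bias: Users like ads with beneficial bias but also acknowledge the problems that algorithmic bias brings, even though they are sometimes unaware of them.
2. Desire to mitigate bias: Personalizing categories of interest reflects users' desire to mitigate algorithmic bias by controlling the types of ads they encounter.
3. Diverse ad preference settings: Users expressed interest in different ad preference mechanisms but preferred simple, intuitive ones.
4. Improved guidance interface: Ad-related actions and settings are buried too deep, making personalization cumbersome for users, and therefore need improvement.
5. Seamless feature integration: Users expressed a need for a seamless experience when using Douyin, which requires integrating personalization features with existing ones.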
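Low-Fidelity Prototype
Speed Dating
Building on the design implications, we used speed dating to explore possible design directions, validate user needs, and identify risk factors.
Most participants endorsed Concept 1.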
speed dating 01
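- Most participants were interested in real-time feedback.
- Users preferred simple, intuitive features.
- Users preferred simple interactions on Douyin's “For You” page.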
speed dating 02
- We wanted to add a plugin to eliminate ad bias, but most people found it too complicated.
speed dating 03
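- Although the idea of sharing preferences was novel, most participants were confused. It raised concerns about privacy and about their everyday social interaction patterns.
Low-Fidelity Prototype
Building on the initial speed dating, we ran scenario-based prototype tests with a low-fidelity prototype, recruiting 7 participants in total. During testing, we identified three key moments in the user experience: opening Douyin, swiping away repeatedly, and watching full ads multiple times. To explore these moments further, we turned the interactive prompts into paper prototypes and tested them in a real-life setting, asking participants to use Douyin on their own phones.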
physical prototype
scenario 01
scenario 02
scenario 03
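Evaluation
Overall feedback from prototype testing was positive, and our design greatly improved user satisfaction. However, flaws in the paper prototype disrupted the continuity of the test flow. Given how the final design would be presented, the positive feedback weighed more heavily in our final decision.
Final Design Prototype
Here we present our design for an ad personalization plugin that addresses algorithmic bias. Because the solution takes the form of a plugin, we reused the components and visual style of Douyin's current UI and integrated the plugin into the existing user flow.
Scenario 1: Tutorial
We moved the hidden “Not interested” button next to the “Like” button in the right-hand column. When users open Douyin for the first time, the plugin shows a tutorial for this feature.
Scenario 2: Interactive Prompt
When a user keeps swiping up to skip videos, an interactive prompt appears. Depending on the option selected, the plugin responds differently: it adjusts recommendations (showing less or more similar content) or navigates to the report page.
Scenario 3: Personalization Prompt
If a user watches full ads multiple times, the plugin prompts them to choose their preferences and guides them to the ad personalization page. There, users can customize their preferences by selecting personalization tags, giving them better control over the ads they receive.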
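Problem Statement
Scenarios Where Outliers Occur
- Outliers in large datasets: they are hard to detect but can have a large impact on prediction results.
- Webcam intruders during data capture: the continuous-capture mechanism makes cleanup difficult.
- Human-introduced diversity: people cannot precisely control the effect of outliers, which may hinder prediction accuracy.
Design Flaws in Teachable Machine (TM)
- Teachable Machine treats all inputs as training data and cannot identify outliers in the dataset.
- The prediction mechanism is buried too deep and is unfriendly to users unfamiliar with machine learning.
Potential Design Directions
- We can help users identify outliers by introducing a human-centered approach.
- A simple prediction mechanism and appropriate explainability can help users discover the effects of potential outliers.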
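Research Goal
How can we provide feedback that helps users understand how outliers affect the accuracy of the training set, so that they provide higher-quality training samples?
- (RQ1): How can the interface alert people that they may have accidentally introduced outliers into the training dataset?
- (RQ2): How can the interface guide users to filter outliers effectively, using the explanation of how the model works that Teachable Machine provides?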
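To make the idea of a "suggested outlier" concrete, here is a minimal Python sketch of one way an interface could flag candidates for a user to review. It is an illustration only and not how Teachable Machine or this prototype works: it assumes each training sample of a class has already been reduced to a numeric feature vector, and the function name `flag_outliers` and the `z_threshold` parameter are hypothetical.

```python
from statistics import mean, pstdev


def flag_outliers(features, z_threshold=2.0):
    """Return indices of samples that sit unusually far from the class centroid.

    `features` is a list of equal-length numeric vectors for ONE class.
    This representation is an assumption made for the sketch.
    """
    if len(features) < 3:
        return []  # too few samples to judge what is "unusual"
    dims = len(features[0])
    centroid = [mean(sample[d] for sample in features) for d in range(dims)]
    # Euclidean distance of every sample to the centroid
    distances = [
        sum((sample[d] - centroid[d]) ** 2 for d in range(dims)) ** 0.5
        for sample in features
    ]
    mu = mean(distances)
    sigma = pstdev(distances)
    if sigma == 0:
        return []  # all samples are equally far away, so nothing stands out
    # Flag samples whose distance is more than z_threshold deviations above average
    return [i for i, dist in enumerate(distances) if (dist - mu) / sigma > z_threshold]
```

A flagged index would only be a suggestion shown to the user, who still decides whether to remove the sample, which matches the human-in-the-loop direction described above.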
Prototype Design
Initial Solution 01: Human-Supervised Classification
SOLUTION 1
Initial Solution 02: Passive Outlier Identification and Correction
SOLUTION 2
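Final Solution: an “Alert” interface and an “Outlier Filter” interface.
Prototype Testing
I tested the prototype with 5 participants: 1 had a computer science background, 2 had design backgrounds (architecture), and 2 came from interdisciplinary backgrounds (human-computer interaction and cognitive science/management). Most of them had basic knowledge of machine learning or statistics.
Round 1: Tutorial (control group)
Round 2: Train a new model with a manually filtered training set
Round 3: Apply the new model with a new training set
Have participants run TM and observe how the current dataset classifies the specified samples.
For the following rounds, have participants pick their favorite of the two “Alert” interface alternatives to serve as the test prototype for the “Outlier Filter” interface.
Have participants manually refine the training set by removing the outliers suggested in the checklist.
Run TM again on the new dataset and observe how accuracy changes, while trying to learn the outlier pattern.
Have participants manually filter a new dataset based on the outlier pattern learned in Round 2, then run TM again and observe how prediction accuracy changes.
Conduct semi-structured interviews and a survey.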
paper prototype for prototyping
checklist for human-in-the-loop
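Paper prototype of the “Alert” interface
Outlier checklist
(Red highlight: suggested outliers; strikethrough: outliers selected by participants)
Quantitative Analysis
The figure below shows the prediction accuracy on the samples for the 5 participants across the 3 rounds of testing. The filtered dataset achieved the highest accuracy. Compared with the initial round, however, the accuracy of manual filtering improved markedly, possibly because participants had learned the outlier pattern from the suggestions.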
accuracy result in 3 round prototyping
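Qualitative Analysis
Through the survey and semi-structured interviews I gathered more information. Participants shared their findings, feelings, and suggestions from the process, providing valuable perspectives for the design implications that follow.
On Interface Design
For the “Alert” interface, participants preferred the second design:

- They could see the prediction accuracy in it and understand the intent behind removing outliers.
- It encouraged them to consider further optimizations for better performance.

For the “Outlier Filter” interface, participants attributed the better prediction results to its simple, clear design.
Other Findings
Participants showed more curiosity about Teachable Machine's algorithmic mechanism than expected. Although they were willing to learn more about the algorithm, they tried to keep their distance from it during testing. Participants with professional backgrounds were confident in their Round 2 judgments but were disappointed by the Round 3 results.
Design Implications
The “Alert” interface sparked people's desire to understand the factors that affect prediction performance.

The “Outlier Filter” interface gave users a design example that helps them understand how to distinguish outliers and how outliers affect the model.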
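Iteration Suggestions
Further Research Questions
Participants' feedback, especially their confusion, opened up room for further speculation, including:

- Why did Google place the explanation of how the model works under advanced features?

- How can human-introduced bias be balanced against machine learning model bias?

- How would the filtering system perform on unpredictable new datasets?
Interface Iteration Suggestions
They also offered valuable suggestions for further interface iterations, including:
How to view the dataset:

- Row layout (integrated into TM's current upload interface)

- Page preview (integrated after the upload interface)

How to identify flagged outliers:

- In-place flagging

- An outlier section for all class groups

- An outlier section for each class group

How to trace back the filtering process:

- Archive section (backs up the best-performing filtered dataset)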

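The Project
Piggyback prototyping to test the effectiveness and likability of social bots in connecting people on social platforms
Research Goal
How effective are social bots at encouraging private one-to-one connections by sending curated direct messages on Twitter (effectiveness), and how do people feel about this way of connecting (likability)?
Q: What is a one-to-one connection?
A: The ability to facilitate connections with strangers (following) and between followers (chatting).
Strategy
Recommend similar friends/topics based on the similarity of recently liked tweets, in order to examine:
- How effectively do follow-up bot messages connect users?
- How do participants' attitudes toward this approach affect their behavior?
Prototyping Session
I selected participants based on their location on Twitter. Most of them live in Pittsburgh.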
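Recommender System: Scraping from Twitter
As part of the recommendation bot, I wrote a Python scraping script to compute the similarity between two users. To find potential users to recommend, I extracted users who had liked the same tweets as the participant and analyzed how similar their recently liked tweets were. If the similarity exceeded a set threshold (85%), I used the top three topic-related hashtags from the conversation as the connection content.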
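The matching step described above can be sketched in a few lines of Python. This is a hedged illustration, not the original script: the text does not say which similarity measure was used, so the sketch assumes Jaccard similarity over the hashtags found in each user's recently liked tweets, and the function names (`jaccard_similarity`, `top_shared_hashtags`, `recommend`) are hypothetical. Scraping itself is left out; the inputs are plain lists of hashtags.

```python
from collections import Counter

SIMILARITY_THRESHOLD = 0.85  # the 85% threshold mentioned in the text


def jaccard_similarity(tags_a, tags_b):
    """Share of distinct hashtags two users have in common (assumed measure)."""
    set_a, set_b = set(tags_a), set(tags_b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def top_shared_hashtags(tags_a, tags_b, k=3):
    """Return the k most frequent hashtags that both users have liked."""
    shared = set(tags_a) & set(tags_b)
    counts = Counter(tag for tag in list(tags_a) + list(tags_b) if tag in shared)
    # Sort by descending count, then alphabetically so ties are deterministic
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:k]]


def recommend(participant_tags, candidate_tags):
    """Return up to three hashtags to use as connection content, or None."""
    score = jaccard_similarity(participant_tags, candidate_tags)
    if score > SIMILARITY_THRESHOLD:
        return top_shared_hashtags(participant_tags, candidate_tags)
    return None
```

With this measure, a candidate who shares 7 of 8 distinct hashtags with the participant scores 0.875 and passes the threshold, while a candidate with no overlap is skipped.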
scraping algo code
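Recommender System: Conversational Tree
I built a conversational tree that organizes how prompts are sent to participants according to the purpose of the current prototyping session and the level of each response.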
conversational tree
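Recommender System: Human-Operated Bot
After completing the preparation above, I created an “official” Twitter account and began manually sending follow-up prompts based on the replies (I had no automation skills at the time, haha). Participants who were willing to help with the project received a questionnaire.
twitter account
Prototype effect
Prototyping Session: Piggyback Prototyping and Role-Based Interviews
The prototyping session consisted of two parts, each with 50 participants. Piggyback II excluded users who follow the participants. Because the results did not yield enough feedback for analysis, I conducted role-based interviews.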
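Quantitative Analysis
Judging by the percentage of users who followed the prompts, the results were not promising. I tested 100 random participants, but the critical mass for this prototype remains unclear.

Without increasing the block rate (which would lower likability), follow-up questions led participants to implicitly check and follow the instructions.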
Response Rate
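Qualitative Analysis
In the interviews, participants identified the following factors as affecting their willingness to respond to the bot:

- Privacy concerns: whether the bot is verified affects users' trust in its messages. Many potential participants had closed their DMs to strangers or did not include DM information in their profiles.

- Human-likeness level: a human-like image or response earns more sympathy from users.

- Algorithmic transparency: opaque algorithms frighten people who receive messages out of nowhere.
Insights and Implications
Direct messages are an annoying way to deliver bot messages, which may explain why only some people replied to the bot.

The critical mass for this prototype remains unclear; more than 100 samples may be needed to collect enough feedback for analysis.

Follow-up questions did raise participants' attention and curiosity. However, judging by the percentage of users who followed the prompts, the results were not promising.
Next Steps
This prototyping exercise was hardly a success, but it gave me many insights for next steps:
Connection bot design:
- With better programming skills, I could automate the entire process to reach a larger piggyback scale.
- There was no mechanism encouraging participants to give feedback on the conversation (whether they wanted to receive the messages and how they felt about them). The conversational tree should also encourage participants to respond implicitly to the bot's prompts.

Prototyping algorithm:
- Similarity could also incorporate profiles, hashtags, new tweets, and replies; a more composite algorithm might improve recommendation accuracy and draw more positive feedback.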
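Research Background
The development of media technology and the rise of maker culture have greatly changed how people engage in creative art practice. On one hand, education has shifted from traditional face-to-face teaching to various forms of remote teaching, a trend further accelerated by the COVID-19 pandemic. On the other hand, art-oriented craft teaching has become increasingly widespread, further blurring the line between professional and amateur practitioners and lowering the depth of skill that craft practice demands. This shift makes remote education more feasible and more accessible to the general public.

There have been many attempts to teach crafts using technologies such as sensors, multimedia, and extended reality (XR), across domains including pottery, textiles, and origami. XR in particular has shown potential advantages for remote teaching because it offers immersive experiences and can overlay rich information onto the real world. However, our literature review and our first-hand experience in pottery classes show that previous applications of XR in craft education have failed to convey the embodied nature of craft teaching. These applications tend to overlook the interaction between the body (especially the hands) and the physical environment of materials and tools, the transfer of complex tasks and tacit knowledge, and the social participation found in traditional apprenticeship.

Based on these research gaps, we want to explore:
(1) How can XR technology be used to visualize and learn tacit knowledge?
(2) How can XR technology simulate traditional face-to-face apprenticeship and help students learn during complex craft-making processes full of unpredictable situations?

To address these challenges, we adopted an AI-assisted mixed reality (MR) technical framework. MR lets users interact with real-world materials using their own bodies, preserving the full experience of craft practice while drawing on digital technology. In addition, MR can present multimedia elements that make tacit knowledge easier to access and understand. AI enhances this experience by providing context-aware adaptive feedback and a question-answering feedback and guidance system.
Research Questions
How does an AI-assisted mixed reality pottery guidance system affect the embodied teaching and learning of pottery?
(R1) How does the interplay between digital technology and the embodied nature of craft education affect the design of the system?
(R2) How do ceramic artists of different skill levels and roles perceive the system's role in supporting their learning and teaching goals?
(R3) Beyond that, what other effects does the system have on their practice?
Research Methods
Based on embodied interaction theory and our proposed technical framework, we designed an AI-assisted mixed reality pottery guidance system. The system runs on the Meta Quest 3 and focuses on wheel throwing, a fundamental skill for ceramic practitioners at every level and the first lesson for beginners. It lets users interact directly with clay on the pottery wheel while seeing real-time feedback from the AI. We used the theoretical framework of embodied interaction to build the system and iterated on it through co-design with instructors and learners.

Our system includes two AI-assisted components:
(1) Multimedia interactive step-by-step tutorial: includes voice instructions and immersive video, as well as animation, gesture simulation, and gesture recognition for real-time gesture guidance.

(2) Real-time feedback system: provides rule-based shape correction, computer-vision-based shape comparison, and real-time suggestions generated by a large language model.

These components can be combined according to each learner's skill level to provide a tailored educational experience.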
Ceramic Guiding System
The system’s functionality is derived from design goals, with specific features informed by findings from the auto-ethnographic study. In this section, we present the system setup and provide a detailed description of its functionality with a brief overview of its technical implementation.
Setup
The system’s setup is simple and intuitive. We set up our ceramic corner with a VEVOR 9.8" LCD Touch Screen Clay Wheel GCJX-008 and use a Logitech C920 webcam to capture the shape data. To ensure a clear distinction between the clay and the background, the corner is wrapped in black cloth, and the basin of the pottery wheel has been removed. Our tool is built on Meta Quest 3, utilizing its passthrough mode to allow users to see their surroundings while wearing the device. To get started, users wear the headset, sit in front of the pottery wheel, select the appropriate speed and mode, and initialize the system using the controller. Under the system’s guidance, they then practice making ceramics by hand.
Setup
Functionality
For Novice Learner: Step-by-step Learning

(1) Learning Flow: Watch, Imitate and Practice
Our system provides overlaid holographic gestures and real-time gesture recognition, and gives learners time to practice. From our auto-ethnographic study, we observed that in a traditional ceramics class, students observe and follow their instructors. Correct and precise gestures are critical in wheel-throwing because ceramic making demands high precision: a small mistake, such as a slight shake, can ruin the current progress. However, traditional teaching methods often struggle to provide real-time feedback during critical incidents. To restore and enhance the learning flow of a traditional setting, our system adopts a similar procedure: users first watch a holographic animation to observe the correct gestures. They then imitate the correct starting gesture with real-time text-based hand-part suggestions and practice what they have learned using the hologram.
The gesture holograms are created based on schematic diagrams, modeled in Blender, and imported into Unity as prefabs. For gesture recognition, we piggyback on the XRHand package and surface the output of its gesture recognition algorithm as text-based instructions. The language is simplified to make it easier to understand, ensuring accessibility for learners of varying skill levels.

(2) 3D learning: Hologram, Video and Tips
Ceramic learning is a three-dimensional task, requiring learners to shape a ceramic piece in real space. In traditional studio settings, observation alone is often insufficient to capture the nuanced details of hand movements and their interaction with the clay. Even with hands-on guidance, the learner’s perspective is typically limited to a third-person view. Our system leverages Mixed Reality to create a 3D spatial immersive display, providing 1) first-person hologram overlays that display the relationship between hands and clay, allowing users to closely observe and understand the intricate details of the gesture; and 2) third-person demonstration videos and tips from experts that restore the knowledge of the traditional studio while conveying tacit knowledge such as wetness, pressure, time, and speed. By combining these perspectives, the system enhances the learner’s understanding of both explicit and tacit aspects of ceramic making, bridging the gap between traditional and immersive learning experiences.

(3) Learn from feedback: Correction and Summary
For novice learners, it’s crucial to receive feedback during their learning journey when they 1) encounter difficulties and 2) seek professional evaluation of their skill and advice on how to improve in their next attempt. Our system fulfills these needs by providing corrective guidance during “critical incidents,” replicating the hands-on instructional experience of in-person learning. The system also generates ad-hoc summaries with suggestions on the final clay shape.
To produce the feedback, we use the Python-based computer vision library OpenCV to capture the shape outline and identify the critical positions using the Rhino.Python library. Our system evaluates progress in two steps: 1) comparing the current shape with the target shape and 2) checking for fundamental indicators of quality, such as symmetry. The spatial data is sent to the system via servers for analysis. Based on the evaluation criteria, the system selects the corresponding correction gesture. The spatial data is also processed in two ways: (1) a Python script calculates the similarity score between the current and target shapes and (2) the OpenAI API generates text-based suggestions using a novice-level prompt.
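The two evaluation steps can be illustrated with a small, self-contained Python sketch. It is not the system's actual code: it assumes the outline has already been extracted (for example with OpenCV) and reduced to radii sampled at the same heights for the current and target profiles, and the scoring formula, the `tolerance` value, and the function names `similarity_score` and `is_symmetric` are all assumptions made for illustration.

```python
def similarity_score(current_radii, target_radii):
    """Score in [0, 1]: 1.0 means the current profile matches the target.

    Both inputs are radii sampled at the same heights (an assumed format).
    """
    if len(current_radii) != len(target_radii) or not target_radii:
        raise ValueError("profiles must be non-empty and sampled at the same heights")
    mean_target = sum(target_radii) / len(target_radii)
    if mean_target <= 0:
        raise ValueError("target radii must be positive")
    mean_error = sum(abs(c - t) for c, t in zip(current_radii, target_radii)) / len(target_radii)
    # Normalize the average deviation by the average target radius
    return max(0.0, 1.0 - mean_error / mean_target)


def is_symmetric(left_widths, right_widths, tolerance=0.05):
    """Check that left and right half-widths agree within a relative tolerance."""
    for left, right in zip(left_widths, right_widths):
        widest = max(left, right)
        if widest > 0 and abs(left - right) / widest > tolerance:
            return False
    return True
```

In a pipeline like the one described, a low similarity score or a failed symmetry check would be the trigger for selecting a correction gesture; the thresholds for that decision are not given in the text and would need to be tuned.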

For Advanced Practitioner: Guiding Assistant

(1) Improvisation
Experienced practitioners have already acquired the skills needed to make ceramics. The system aims to give these users the freedom to work independently while still supporting them in refining their pieces. During our ceramic study journey and pilot studies, we observed that practitioners need to refresh their skills. The advanced mode visualizes the current and target shapes in real time on the side panel for reference, so practitioners can make pottery at their own pace while periodically referring to the updates as needed. They can also refresh their wheel-throwing skills by revisiting gesture and shape holograms, reinforcing their techniques throughout their learning journey.

(2) Real-time Shape Guidance
For experienced practitioners, the goal shifts from simply creating a decent ceramic shape to achieving perfection, even in handmade work. Our system provides color-coded overlays to guide them in shaping the clay by comparing the current shape to the target shape during the process. The system uses three colors: red for "push inward," green for "correct shape," and blue for "pull outward."
To generate the color-coded overlay, we use OpenCV to capture the shape outline and the Rhino library to analyze it, compare it with the target shape, and generate several profile slices at the intersection points between the two outlines. These slices are then swept to create spatial shape points. The spatial data is sent to the system via servers and visualized as an overlay on the anchor, providing real-time guidance to the practitioner.
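The three-color rule can be sketched in Python as a simple per-slice classification. This is a hedged illustration rather than the system's implementation: it assumes each profile slice is summarized by a current and a target radius, that "push inward" applies where the clay is wider than the target and "pull outward" where it is narrower, and that a hypothetical `tolerance` defines what counts as the correct shape.

```python
def overlay_color(current_radius, target_radius, tolerance=0.2):
    """Map one profile slice to an overlay color.

    red   = clay extends past the target, so push inward
    blue  = clay falls short of the target, so pull outward
    green = within tolerance of the target shape
    """
    difference = current_radius - target_radius
    if difference > tolerance:
        return "red"
    if difference < -tolerance:
        return "blue"
    return "green"


def overlay_colors(current_radii, target_radii, tolerance=0.2):
    """Classify every slice of the profile, bottom to top."""
    return [overlay_color(c, t, tolerance) for c, t in zip(current_radii, target_radii)]
```

For example, a piece that is too wide at the base, on target in the middle, and too narrow at the rim would be colored red, green, and blue from bottom to top.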

(3) Real-time Multimodal Suggestion
Experienced practitioners’ advanced knowledge of skills and tools leads them to expect more detailed and precise instructional assistance. Expanding on the summary in elementary mode, our system provides multimodal suggestions based on essential pottery parts, integrating text, audio, and gesture/shape holograms. This multimodal approach provides more detailed information in a form that experienced users can absorb more easily.
To generate suggestions, the system passes spatial data to the OpenAI API to get text suggestions and uses text-to-speech technology to verbalize them. It also selects corresponding gestures and correction prefab models from our asset library based on a comparison of the current and target shapes at the essential body parts.
Video Demo
User Study
Recruitment and Participants
As mentioned in the methodology section, our user study comprised two phases. In the first phase, we brought our system to traditional ceramic studios, conducted a “quick and dirty” short-term ethnographic study with instructors and students, and reflected with them on site about their practice and the system. In the second phase, we refined the system based on observations and feedback from the first phase. We then invited ceramic makers with varying skill levels to participate in on-site prototyping sessions. Following these sessions, we conducted semi-structured interviews and surveys to collect data for analysis.
To recruit participants for the first phase, we contacted local ceramic studios and schools. For the second phase, we distributed recruitment posters across our university's facilities, screening for participants with different skill levels: novice learners, experienced learners, and instructors. After each session, we also spread the study by word of mouth. To ensure diversity, we avoided recruiting from the same pool as our pilot study. Cold recruitment through posters across multiple facilities helped mitigate bias by attracting participants from different backgrounds and ensuring they had no prior exposure to the system.
Ultimately, for Phase 1, we invited 2 instructors and 4 student participants, conducting two ethnographic study sessions. During each session, participants practiced the same skills three times. For Phase 2, we invited 21 participants, divided into two groups: 8 participants in the experienced group (2 instructors, 4 proficient practitioners, and 2 experienced practitioners), assigned IDs from A-P01 to A-P08, and 13 participants in the novice group, assigned IDs from E-P01 to E-P13.
Procedures
Ethnographic study and On-site reflection

(1) Consent: At the beginning of the study, we provided participants with a detailed explanation of the study’s purpose and procedures and obtained their consent. The study was approved by our local IRB under protocol # STUDY2024_00000325.

(2) Observation: The ethnographic study was conducted 1:2 at ceramic studios, with one researcher observing two participants at a time. The researcher sat alongside the participants and documented their operation with minimum interference to their learning process. Each session lasted approximately one hour.

(3) Comparison and reflection:  Following the session, both instructors and students were invited to watch our demo of the system and participate in an informal discussion. The discussion aimed to explore: (1) What they did during the session and how they reflect on it; (2) Their opinion on the system’s functionality; (3) How they will use the system in their practice; and (4) Feedback on system limitations and suggestions for improvement. The complete interview questions can be found in Appendix X.

On-site Prototyping Session, Semi-Structured Interview and Survey

(1) Consent and Onboarding: At the beginning of the study, we explained the study to participants in detail and obtained their consent. Our local IRB approved the study under protocol # STUDY2024_00000325. After securing consent, we conducted an onboarding session of about 5 minutes to help them become familiar with the headset, the pottery wheel, and the material (air-dry clay). During this period, they were free to ask questions and seek clarification so that they were comfortable using the system before the study began.

(2) Prototyping Session - Task: The prototyping sessions were conducted on a 1:1 basis on the CMU campus for both experienced and novice groups. Each session gave participants 25–30 minutes to complete the same task: making a vase. Participants used different modes depending on their group: the novice group chose elementary mode and followed the step-by-step tutorial, while the experienced group chose advanced mode and worked independently toward the given target shape. During the process, they were free to explore the functions they found useful and to steer their own making progress. Researchers encouraged participants to explore as many features as possible to provide comprehensive feedback but did not mandate their usage. The researcher sat nearby, observing participants’ operations and providing assistance if needed. Figure X shows examples of ceramic pieces created during the sessions.

(3) Semi-structured Interview and Survey: After the session, we conducted a 20–25-minute semi-structured interview with each participant. The interview for the experienced group aimed to reflect on (1) their experience with the system; (2) their interaction dynamics with the system, including how they would use it and how it influenced their existing practices; (3) feedback on system limitations and suggestions for improvement. For the novice group, the interview sought to gather insights into (1) participants’ experience with the system; (2) potential usage scenarios. Additionally, participants completed a survey designed to complement the interviews by capturing attitudes and feedback quantitatively. Both the interview protocol and survey were designed based on the design goals in the methodology section. The complete interview questions and survey can be found in Appendix X.
Data Collection and Analysis
For qualitative analysis, we took hand-written notes during the task sessions to document participants’ activities and drew on those notes in our interviews. We recorded the semi-structured interviews and transcribed them using Otter.ai. Following Braun and Clarke’s thematic analysis approach (Clarke & Braun, 2017), we then went through the transcripts, conducted interpretation sessions to extract relevant information, and created affinity diagrams to identify, refine, and iterate on the themes with inductive coding during weekly meetings over two months. For quantitative analysis, we collected survey responses with Google Forms.
Results
We present our analysis results as five themes. The novice and experienced groups share common ground but also hold different opinions. Our findings reveal discrepancies in how skill levels influence perceptions of the system (Theme 1), uncover tensions between the virtual and the real (Theme 2), reflect the impact of workflows on skill transfer (Theme 3), and demonstrate how the system redefines roles and contexts in craft-making (Theme 4). These findings indicate the system’s potential to support learning across skill levels while addressing limitations in creativity and improvisation in the craft process (Theme 5).

Theme 1: Common Process, Different Goals: Craft Skills Across Experience Levels

At the beginning of each interview, we asked both groups to reflect on which step they found easiest and which most difficult. Among novice participants, many considered centering to be the easiest step and pulling up the most challenging (E-P01, E-P03, E-P04, E-P06). In contrast, all experienced participants had the opposite view. A-P01 attributed this difference to the two groups' different goals: novices focus on mastering each step, and without proper practice, harder steps such as pulling up are difficult, while experienced individuals prioritize the final outcome, which is directly affected by centering. “So experienced are going to have a result oriented, novices are going to have a process oriented.” Despite their differing views on specific steps, participants agreed on several aspects of the learning process. All mentioned that having a big picture is important. For example, E-P09 reflected on a past mistake: “I didn't really understand the concept of how the air bubbles might affect me later. But at the time, it didn't, that didn't click very well.” It is also essential to learn by making progress rather than by fixing, as A-P01 observed in her teaching: “They've been repairing the clay, breaking it and replacing it with another one.” Participants also agreed on the importance of practice, which was universally recognized as essential for developing familiarity with the material and process. As A-P02-Pilot shared: “I was just playing with some water in some clay but doing circles and understanding like how the wheel goes both ways and how like your movement affects.” E-P04 emphasized the importance of learning through failure: “Maybe everyone needs to experience how it goes wrong before they master it.” A-P01 highlighted how positive results from practice encourage beginners: “I think it's very rewarding for beginners to be able to fine-tune the steps and then slowly pull out a shape.”

Theme 2: Tensions Between Virtual and Realities in Embodied Craft-Making

(1) Video vs Gesture: Real-world Resolution and Tacit Knowledge

The system provides both video and gesture-based guidance for learning each step. Participants generally felt that gestures effectively helped them understand what actions to perform and where to position their hands. For example, E-P04 found this very helpful: “What's most helpful was the initial gesture, like where the finger should be put on, which part of the ceramic.” However, participants noted that the gesture instructions lacked detail regarding the interaction between the hand and clay. Some were unsure about which part of the hand to use or how to apply pressure. E-P09 expressed his confusion: “It shows you where your hand should go, but it doesn't say what part of your hand is supposed to be touching the clay.” Some participants also felt the instructions for hand movement were too mechanical. A-P07 commented on the gesture recognition: “...but I have to read a lot of sentences. I don't know how to actually listen to the thing to make changes.” E-P06 felt the gesture animations were too “mathematical”: “It's (gesture) happening in a mathematical way. it's just going up in a very equal way and it doesn't seem like it's real.” In contrast, participants found videos more helpful because they provided richer details with real-world resolution and tacit knowledge. Videos allowed participants to observe hand gestures, clay shape changes, and the mutual effects more intuitively. E-P08 mentioned her experience when referring to the video: “I saw the video and I tried to make the same thing, and I saw mine, I know it's not perfect, but okay, I think I can move on.” Through video, participants could also infer additional details such as the amount of water required, appropriate speed, and the pressure needed. E-P02 noted: “I can see movement and also the texture of the clay. Sometimes I can judge if I need more water or not.” For some experienced practitioners, video evoked their expertise better than gestures did. A-P07 said: “The system has the hand over the clay, and it does have instructions, but when I watched the video, I could kind of see her, feeling the clay and, shaping it, so that was helpful.”

(2) View and touch in 3D learning: View Angle and Overlay Placement

Participants felt that the combination of gesture/shape holograms, videos, and tips helped them observe the instructions from different view angles and thus better understand the spatial shape. E-P04 appreciated this feature, stating: “Just as I feel like it did it well by seeing how it's three dimensional. It’s hollow inside. So it's kind of you get what it is like in a brief structural way.” In addition, triangulating across multiple sources helped participants avoid cognitive errors when following the gestures. E-P02-Pilot reflected on her practice: “because I was trying to follow the video and video is a mirror image, I think I did it the wrong way because I didn't pay enough attention to the hand hologram.” Regarding the location of the hologram, participants shared mixed feedback. Some felt the shape hologram helped them identify the goal and served as a reference for their current progress. A-P01 said: “... you're able to put the shapes in here (the pottery wheel by the system) so that I can see and compare them, and I think that that is very helpful in terms of practicing..” Others expressed concerns. On the one hand, the overlay may obstruct the view, affecting the precision of wheel-throwing. On the other hand, novices felt an urge to practice on the hologram itself, sometimes destroying their progress because their skills were still incomplete. E-P09 explained: “Because once you do it on clay, it is very hard to correct on clay… As soon as my hands were in place, I was already touching the clay.”

(3) Body extension limitation: Navigating Virtual and Physical Information

The system changed the layers of reality during embodied craft practice. While the headset extends the body’s ability to perceive virtual information, it also introduces a conflict between virtual instructions and the physical demands of wheel-throwing. Participants felt they needed to constantly move their bodies to take in the spatially organized information, which sometimes compromised the precision that wheel-throwing requires. A-P07 highlighted this challenge: “So if I came too close to my part, I cannot see instructions. So I might just stay a physical distance away from my pottery to make sure I can read all of it. But that means I cannot look very closely at the pottery, the details.” E-P04 shared a similar concern: “I have to move a lot actively in order to see everything, which is also another layer of inconvenience.” The absence of a real instructor also adds a new layer of communication complexity. Participants felt that interacting with the system disrupted the natural hierarchy of information exchange, leading to information overload and a sense of disorientation. E-P02 expressed frustration with the abundance of simultaneous instructions: “There's a lot of things that just go automatically. There's instruction verbally, and then also with sound, with text. And then there's a gesture. And then there's a countdown.”

Theme 3: System Workflow’s Impact on Embodied Skill Transfer

(1) Immersion and Streamlining: Benefits and Limitations

Participants acknowledged that the system's workflow is immersive and restores real-world experiences to some extent. E-P01-Pilot appreciated how the system engaged learners: “I feel like it will stick more to the learners compared to a person telling you what you need to do… your voice command is not conversational in a sense, but kind of helps me to orient myself in the learning process.” Similarly, E-P08 likened the system to a video game and expressed enthusiasm for it: “I play video games. I was very excited to use something like that in video games. It was very nice to use it because I felt like I was playing a video game.” A-P01 noted: “It can't completely replace reality, but it basically reproduces what's in the scene.” Despite its immersive qualities, the workflow introduced certain challenges. The streamlined workflow made participants feel compelled to continue even when things went wrong. E-P04 expressed her unease: “The step seems to be very streamlined, but if there is any chance there is an unexpected something there kind of freaked me out.” Participants also struggled with the lack of a big picture. Although the system provides text prompts for step goals and exit conditions, it did not adequately triangulate these with an overarching view of the process. E-P09 explained a mistake he made due to this limitation: “it said the base should be four to five millimeters thick. so I was aiming for the base of the wall, and it hit me later when I realized my thumb's going really far down here.” For the same reason, participants questioned the timing for using specific functions and transitioning between steps. A-P08 gave some suggestions: “I think checking is good if it reassures that I am ready to move to the next step. But I also wonder if it should tell me that I'm ready to move to the next step without me having to express that.”

(2) Autonomy: Empowerment and Challenges in Knowledge Acquisition

Many participants appreciated the autonomy offered by the system, which stood out compared to traditional education and other learning mediums. This autonomy allowed users to customize their progress while engaged in physical activities. E-P09 expressed his preference for the system: “Even if I had like a video playing, showing me how to do it, this (system) is much better than that because if I'm in the middle of doing it, I can't just pause the video and restart or look for Google tips on what I've done wrong very easily. So that helped a lot.” A-P01-Pilot held the same opinion: “I like how much autonomy you give the user. Like you can skip certain parts. You can decide. You can basically override the system.” Participants also valued the freedom of trial and error without pressure from instructors. E-P06 said: “The good thing is you can keep asking again and again, and not have to worry about teacher fatigue or patience. You just keep asking for the same instruction so that's very beneficial.” However, some participants also pointed out the downsides of autonomy. The system’s flexibility might lead users to skip critical steps, leaving gaps in their understanding of essential knowledge. A-P02-Pilot indicated: “...because you cannot see anyone else, so if you cannot understand that, you cannot even go past that, and even if you just skip it, still you're missing the technique, the theory behind it, so you cannot actually learn it.”

Theme 4: Shifting Roles and Contexts Through System Intervention

(1) Usage Scenarios: When and How to Use the System

Participants viewed the system as a mediator “in between videos and in-person,” as E-P02 described. They agreed that the system could serve as a bridge between instructors and learners. A-P02-Pilot emphasized this potential: “It could be cool to bounce back with a real professor because if a professor can give you some input but you're not that good yet you can try and improve that particular skill with the tool.” Participants believed they could use the system to learn different skills, as E-P02 suggested: “Maybe I'll be able to follow different tutorials to make different shapes. That would be kind of nice.” They could also get basic guidance. A-P07 noted its usefulness for trying new techniques: “If I'm going to learn a new technique, I might use it and then try it one or twice.” E-P07 stated: “In the early stages, instead of you alone practicing, you can practice with the AI.” They could practice before seeing instructors as well. E-P07 highlighted its value for early practice: “...it can be complemented with ceramic classes if I ever take them, so I can use once a week teacher and twice a week this, because it's no use seeing the teacher again and again if you don't practice.” Participants saw potential for the system to shift the paradigm of craft education by offering personalized and scalable learning. A-P08 observed: “Instead of the human having to teach everything, it's more personalized and nuanced, more contextual of what the student is lacking.” They envisioned batch education, where the system helps establish fundamental skills for all learners, freeing instructors to provide personalized assistance only when necessary. A-P08 explained: “Maybe if it's in a class setting, everyone uses the VR headset at the same time to establish the same understanding. And then for parts that they're struggling with, the human can come in.” Participants also imagined the system being used in diverse contexts beyond education, such as hobbies, production, and dating.
A-P01 described the system as ideal for amateurs aiming to learn professionally: “I think it's a hobbyist's aid, it's more for the amateurs who want to learn something, to be able to learn it relatively professionally, and learn it to a very good status.” While recognizing that production requires specialization, participants saw potential for the system to assist in specific steps. A-P01 noted: “a production aid needs to be very specialized, it's a very complicated thing to make, and it (the system) may be an aid to some of the steps.” E-P02-Pilot mentioned its popularity as a creative activity for socializing: “It's a common It's quite a popular dating idea” 

(2) Impact on Usage Scenarios: Comparing System with Instructors

The system's functionality shapes participants' attitudes and assumptions about its intended usage scenarios, particularly when compared with its alternative: real instructors. We therefore guided participants to reflect on the system's functions and compare the two. Participants identified several benefits of learning from real instructors. (1) Real instructors can provide physical guidance and correction, as E-P04 stated: “The teacher will be able to directly do it for me, correct it so that I can use a relatively perfect shape. I can have it done before I move on to the next step, but here I have to go with whatever I have.” (2) Real instructors can provide emotional support. E-P02 expressed what she wished the system would do: “I wish it was asking me ‘Are you ready?’ And I'd be like, ‘oh yeah, I'm ready.’ to be more interactive.” (3) Real instructors can provide customized tacit knowledge, as E-P02-Pilot noted: “... a person can understand how much knowledge you have, and also where you are in the process from the start to the beginning.” E-P09 shared: “They would have had experience like, I remember I screwed this up before. And this is how I didn't do that anymore,” and added: “They could see if it's too wet or not wet enough, or you need to slow it down and that helped a lot.” (4) Real instructors are more mobile and sensitive. E-P05 mentioned being able to ask specific questions, for example about wheel speed: “you could ask about the speed of the wheel for real-time, get real-time feedback, asking to check, and then getting the feedback.” E-P03-Pilot mentioned instructors' ability to intervene preemptively: “an in-person teacher can intervene before the user even knows to ask for instruction.” E-P03 reflected on the system's limitations in clarifying doubts: “It gives you certain instructions, but it can't really elaborate on certain things, or you can't ask it any questions.”
Participants also acknowledged the system's unique benefits. (1) It can provide a knowledge repository of different practices and contexts. A-P01 suggested: “I think we can give them more choices, for example, for beginners, which is the easiest way to pull higher and more stable without destroying the center.” (2) It can work as a recording tool that provides remote, asynchronous instruction. A-P01 proposed: “If you have a problem, the teacher can upload the video to the system when he/she is demonstrating.”

Theme 5: Improvisation and Creativity: Constraints from craft nature and skill levels

Improvisation and creativity, while closely related, have a nuanced and sometimes conflicting relationship in the context of craft. During the study, most participants felt they were simply following instructions, which left no room for creativity. The experienced group, however, provided deeper insights. Wheel-throwing is by nature a craft that pursues perfection, which leaves little space for creativity. A-P02-Pilot described this as “a bad thing because sometimes precision doesn't let you be creative.” Its circular form also limits the creative possibilities, as A-P01-Pilot explained: “Because everything has to be circular. And you only have creativity in this one dimension.” In ceramic making, most creativity happens after this stage. A-P02-Pilot added: “People will do a very creative drawing on it, and they will do other decorations on the piece, or they will glaze it in a certain way.”

Skill level has a notable effect on improvisation. Beginners tend to follow the tutorials closely, but when their practice does not work as expected, or when they forget a step because of limited proficiency, they passively improvise to find a solution. E-P08 described her improvisation: “Because I tried to cut it, but I think it was not so good, so I tried to shape it a little more.” As practitioners become more skilled, they master the skills and tools needed to improvise, but the pursuit of a perfect shape and reliance on past experience confine improvisation. A-P01 reflected on this: “but if I’m an experienced person, and my goal is clear, make this shape, then I don't have this creative process, I only have the process of following the steps to finish it.” Similarly, A-P02-Pilot commented on the comfort of working within known techniques: “And I feel comfortable with the ones that I do know, rather than the ones that you're showing us.”
Discussion
The user study provided valuable insights into how both novice and experienced practitioners perceive the system and its effect on ceramic making. In this section, we discuss our findings and explore the dynamics between people, technology, and tools. Beyond the insights, we also revisit the teaching and learning paradigm observed in ceramic studios, reflect on how the system differs from in-person apprenticeship, and propose implications for future iterations and improvements.
How does XR + AI affect explicit/tacit knowledge representation and acquisition in embodied craft making?
Our system tried to replicate and enhance what an instructor does when teaching or guiding ceramic making. The process is twofold. First, we extract knowledge from instructors in a form the headset can display effectively. Second, we organize that knowledge, using the design goals in the methodology section, into a workflow for people to learn and follow. How does the design of these two steps affect people's making?

For knowledge extraction, Mixed Reality serves as a container for the distilled and visualized knowledge. While wheel-throwing in ceramic making follows a highly standardized procedure due to its nature as a craft, there are diverse steps and techniques to achieve the same goals. Participants suggested that the system could function as a dataset including a wide range of practice. However, this raises critical questions about managing information, particularly for novice learners who may already find the complexity of 3D crafts with the high precision requirement overwhelming. Should the system prioritize building a standard of practice since it’s replicable? If a standardized approach is implemented, what happens when it fails to accommodate personal learning styles or objectives? Striking a balance between keeping consistency and embracing variety could be a huge challenge for the system's next step.

In terms of knowledge acquisition, the system provides autonomy to both learners and instructors, enhancing both individual learning and contextual teaching practices. However, the concept of autonomy in learning should be considered multidimensionally. Horizontally, as the system has achieved, autonomy enables self-paced learning. Vertically, as practitioners gain experience and develop their skills, the system should help them align with the mindset and tacit knowledge of experienced practitioners. For instance, can the system effectively help learners realize the critical importance of centering, a skill that often involves tacit judgment and that our novice participants tended to overlook? For instructors, the system also provides an opportunity to cultivate empathy for novice learners by supporting their transition through distilled guidance. This bi-directional autonomy, empowering learners while fostering instructors’ empathy, is where AI could process distilled knowledge and bridge these gaps effectively. Ultimately, the system could be not only a learning tool but also a medium for enhancing pedagogical relationships.
What XR + AI can’t do - The discrepancy between virtual capability and reality needs
While building the system, we chose the essential information, workflow, and modality based on our own expertise and studio observation. However, to what extent can the system replicate the contextual learning experience enabled by XR + AI technology? For instance, findings reveal that many participants struggle with step cutoffs. Interestingly, our observations in the ceramic studio indicate that learners in real-world settings do not face the same challenges. This discrepancy raises questions about the unique limitations of the XR + AI system.

One possible explanation could be the resolution of real-world interactions and the hierarchy of information from the findings. In the ceramic studio, instructors can dynamically adapt their teaching to address learners’ difficulties in real time. While the system offers autonomy by allowing learners to replay steps, the pre-programmed animations and fixed step flow cannot replicate the natural granularity of in-person instruction. Besides, instructors provide feedback and suggestions as needed, creating a hierarchical learning experience. In contrast, the system exposes all features simultaneously and lets learners choose, potentially overwhelming learners and disrupting the sense of progression from human-centered teaching. Although these deficiencies could be alleviated through better design, fully replicating the flexibility and contextual sensitivity of in-person instruction will still be a significant challenge.

Another limitation of the XR + AI system is its ability to communicate with users. Many participants expressed a desire to ask alternative questions during sessions to reduce their confusion. What they sought was a bi-directional, conversational approach. While current AI technologies have the potential to support question-and-answer-based learning, our system’s reliance on a passive, order-response model indicates a significant gap in its design. Addressing this limitation opens up several opportunities for improvement.

First, starting with the simplest modality, what is the most natural way to provide easy-to-understand text feedback? In traditional studio sessions, instructors often use metaphors to simplify complex hand movements and reduce learners’ cognitive burden. Could AI generate suitable and accurate metaphors, replicating the tacit expertise built over years of teaching?

Second, how could the system leverage AI to generate multimodal instruction? While our current system integrates text, audio, and gesture holograms using spatial data and LLM-generated text, it struggles to synchronize these modalities. For instance, the system cannot yet distribute gesture holograms based on the generated text. From our observations, most participants relied on interpreting text instructions, but the absence of a robust database of solution gestures often led to mismatches. This challenge presents an opportunity to explore dynamic and integrated multimodal suggestions to improve the coherence and precision of instructions.

Third, how can AI determine exit conditions? For in-person teaching, the decision to move to the next step often relies on implicit judgments by instructors. We envision the system’s AI enhancing users’ understanding of these transitions, but current shape-based judgments are inadequate for the flexible and unpredictable nature of wheel-throwing. Novices rarely produce perfect or even decent shapes, and their preferred shapes may not align with system-defined standards. How might the system adapt to this flexibility without rigid constraints? 

Fourth, is the shape alone sufficient to answer all the questions? Many tacit factors, such as water, pressure, and wheel speed, depend heavily on personal experience and are rarely made explicit during in-person teaching. Even if the system collects such data through an integrated sensor system and passes it to an AI agent, replicating real-world conditions while considering individual preferences and expertise is a huge challenge.

Finally, how can AI provide emotional support to users? From our observations of in-person sessions, we identified a typical emotional support paradigm: (1) students expose vulnerability, (2) instructors provide encouragement after successes, and (3) students express confidence or happiness in their outcomes. These emotional exchanges occur naturally between both parties in the teaching-learning process. How can AI provide a similar emotional response, encouraging users to dedicate themselves to the learning process?

While XR + AI may overcome the limitations above in the near future, there are two intrinsic aspects of ceramic making that it cannot replace. First, ceramic making is an irreversible, long-term effort, and novice learners need instructors to clean up the “mess” after a “critical incident.” Many participants expressed a need for physical intervention. Second, errors in ceramic making can be systemic and cumulative. While instructors can identify these broader patterns and address them through their ability to see the “big picture,” the system lacks this capability. Achieving the “aha” moment, when learners can do it by themselves, requires consistent practice over time. These aspects point to the indispensable role of human presence in ceramic learning.
Beyond XR + AI: Practitioner and Digital Supporting Tools
Why digital supporting tools for craft practitioners are difficult to build: an analysis by skill and goal
We documented and analyzed how practitioners with different skill levels interacted with our system. Here we refer to the power framework for creative support tools (Li et al., 2023) to analyze their interaction with the system through the concepts of “power-in” and “power-over”. In terms of “power-in”, both novice learners and experienced practitioners primarily followed the system’s instructions and guidance to create ceramics. However, we also observed frequent “power-over” moments. Our findings reveal that users often engaged in unconscious or conscious improvisation, driven by their individual intentions and motivations. Experienced practitioners, in particular, tended to override the knowledge the system presented, breaking the soft constraints of the instructions to exercise their autonomy. This indicates a limitation of the current system, which focuses mainly on providing instructional support for learning and guidance but overlooks the potential to encourage and support improvisation.

Our current system is designed to target both experienced and novice learners. However, during the study, we had two participants who had a solid foundation in wheel-throwing but had been away from practice for a long period. These individuals showed a strong tendency toward improvisation after regaining their muscle memory. We also identified one participant who lacked fully developed skills but had reached a stage where their need and impulse for improvisation were increasing. Their existence raises an important question: how can the system better adapt to users in this intermediate stage, respecting their intentions, motivations, and prior knowledge in a more flexible and responsive way?

Another critical issue to consider is how to leverage unconscious improvisation among novice learners to enhance their skills and experiences. Our findings indicate that most unconscious improvisation arises from the system’s limitations, such as its inability to guide users effectively through critical moments or to compensate for their lack of skills. This raises an intriguing question: if the system were sufficiently advanced, could it eliminate unconscious improvisation? At the same time, as the findings indicate, wheel-throwing has diverse approaches. Could unconscious improvisation, in certain cases, provide learners with opportunities to explore and develop creative, alternative skills? If so, how can the system balance remedial, error-driven improvisation against improvisation that leads to meaningful learning and creativity?

Lastly, we see an inherent tension between wheel-throwing as a skill and as a medium for creativity. Ceramic making has a dual nature. On the one hand, it is a creative hobby for the general public. On the other hand, it is a craft essential for professional makers, often tied to production and livelihood. The two ceramic instructors in our study also sell handmade pieces in their studio. In this context, wheel-throwing, as a foundational step in production, often prioritizes precision over improvisation and creativity. However, the appeal of handmade pottery often comes from its creative and personalized qualities, which can emerge not only in post-throwing procedures but also during wheel-throwing. For example, one instructor created a wave pattern on the surface of a piece as an example of integrating creativity directly into wheel-throwing. In that case, how can the system balance the demands of skill practice and creative expression? Participants offered valuable insights for future development, such as gamified challenges or optional advanced steps to encourage improvisation while preserving the integrity of craft practice.
Digital support tools’ new context: Collaboration in education, production and leisure activities 
From the findings, we can envision the future of such digital supporting tools. These tools could be used for remote and elementary craft education, assist with specific steps in handmade production, and even enhance leisure activities such as dating. Regardless of the context, the nature of interaction shifts from traditional human-human engagement to human-computer collaboration. Apprenticeship is transformed into an arrangement in which the system acts as a collaborative mediator between instructors and learners. For experienced practitioners, it also functions as a collaborator, providing guidance and creative possibilities. In more social contexts, such as dating, the system could offer additional emotional support for affective interactions. Under this transformation, the system presents an opportunity to serve as a collaborative tool in the near future.
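Research Background
This research explores the entanglement between large language model (LLM) hallucination and embodied media. Hallucination is an emerging concept, and a design-oriented perspective on it remains to be explored. Current algorithmic experience (AX) prototyping methods mention the negative effects of algorithms but do not detail how to handle these “wrong or unpredictable” outcomes; they also cover only virtual and tangible algorithmic experiences, approached from a human-centered design perspective. This research therefore proposes designing hallucination experiences by extending algorithmic experience to a broader range of embodied media, and applying AX prototyping methods to speculative design. To better convey the experience, the research focuses on two encounter scenarios: from a first-person perspective, it examines the potential of multimodality as a medium for conveying experience; from a third-person perspective, it combines speculative design with embodied media that extend beyond physical materials.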
research gap
narrative
Gaps and opportunities in current research
Narrative media -
first person and third person
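Research Questions
When LLM hallucination blends into everyday embodied interaction:

(RQ1) How do users recognize, interpret, and resonate with hallucinations?

(RQ2) How can designers draw on insights from users to design hallucination experiences through embodied media and theory?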
Research Hypotheses
Algorithmic experience in the interaction process
ax in process
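Experience of algorithm logic/mechanism: This experience stems from how the algorithm works. For prediction-based machine learning algorithms, the input-output relationship is clearly visible. In contrast, LLMs generate results by predicting a sequence of tokens; because the output is in natural language, users influence it more indirectly, which leads to a different interaction experience.

Experience of algorithm-generated content: This experience centers on the type of content the algorithm generates. For LLMs, generated text, speech, images, or 3D models each offer a different experience, depending on the modality in a given scenario.

Experience from human interpretation: Users interpret algorithmic output according to their own sociotechnical background, so their reactions and experiences differ depending on how they interact with the results.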
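Vocabulary of Hallucination Experiences
Hallucination here is an umbrella term for the various technical problems that cause responses to deviate. These problems have different causes, produce different outcomes, and are strongly shaped by sociotechnical context. I propose the following vocabulary as an entry point for imagining potential hallucination experiences:

Empathy: an emotional experience arising from technical flaws. This perspective stems from users' interpretation of hallucinations. Some express frustration, while others treat hallucination as companionship, seeking emotional support in feedback that is wrong but gentle. A technical flaw becomes a personal emotional experience.

Serendipity: an alternative experience that resonates with social relationships. This experience arises from the unexpected social relationships that hallucinations produce, sparking curiosity, empathy, and reflection. These chance moments can help users connect with broader social contexts in meaningful ways.

Alchemy: a creative experience drawn from hallucinated content. In content generation, hallucinations produce inaccurate results yet can spark creativity. Users may find inspiration beyond their knowledge or expectations, turning hallucination into a catalyst for creative exploration.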
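Research Methods
Prototyping Hallucination Experiences
Prototype 01: Moodie Assistant
Keywords: empathy, emotional projection, ambiguity of interpretation

Moodie Assistant frames algorithmic experience as an emotional response to hallucination. The prototype takes the form of a physical voice assistant with a dial that indicates the degree of hallucination. It also comes with a set of remote controls that let users and audience members project their own emotional experience into the conversation through different interaction styles and levels of precision. People in different roles, users and audience alike, respond emotionally in different ways when interacting with the device. The prototype gives us a medium for discussing ambiguity of interpretation.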
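Prototype 02: Whisper Web
Keywords: serendipity, social encounters

WhisperWeb treats the hallucination experience as a social encounter. The prototype takes the form of a chat assistant that collects users' conversations as context to simulate a language model's “training set.” It reflects on how people react when an erroneous context produced by hallucination instead hints at a different social situation. Rather than intervening directly in users' interactions, the prototype uses a visual medium to observe and record how hallucination changes the relationship between people and the conversational agent.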
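Prototype 03: Mindscape
Keywords: alchemy, turning hallucination into creativity

Mindscape builds the experience by seeking creative opportunities in hallucination. The prototype is an extended reality application on an immersive platform that lets users conceive and create an alternate virtual world with a hallucination-amplified language model, with a focus on the brainstorming workflow of ideation and iteration. The prototype aims to study how generated content shaped by hallucination affects creativity. Freed from the constraints of the physical world, it gives imagination the widest possible room.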
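Speculative Film: Experience Narrative
(Pilot) User Study
Six participants were recruited by word of mouth; all were fellow students or graduates of the program. All had extensive design practice experience and were familiar with LLM-related applications and tools. They were divided into three groups, and three observational studies were run over three consecutive days. During the observational studies and the follow-up interviews, the researcher could listen to what participants “said” and observe what they “did.” After analyzing the collected data with affinity diagrams and interactive visualization, the researcher held a workshop the following week for research reflection and exploratory participatory design, inviting participants to propose “solutions” as both design experts and users.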
Findings
(Red: recognition; yellow: interpretation; blue: resonance)
Recognition
Interpretation
Resonance
1
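Characteristics of LLM hallucination
False responses disguised in natural language
Sympathy for the model's limited capabilities
Doubt about whether human expectations align with the model's interpretation
Complex emotional reactions triggered by ambiguous interpretation
Confusion over subtle distortions of fact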
2
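Influence of embodied media on the hallucination experience
The medium's ability to make hallucination interpretable
Sources of hallucination
Empathy arising from interaction with the medium
The medium's ability to indicate hallucination
Interpretability of the modality
Fit between hallucination and the medium's characteristics
Learning burden of the medium
Interplay between hallucination and prototyping technology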
3
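Interaction patterns in the hallucination experience
From irrelevant response content
Triggered by irrelevant or hard-to-understand responses
When hallucination aligns with the user's intent, emotion, and social distance (internal criteria)
From abnormal response patterns
Triggered by unexpected resonance with a wrong context
When the boundary between algorithmic error and hallucination is respected (value judgment)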
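Design Implications
1
Prototype at the boundary between hallucination recognition and resonance
2
Prototype based on the nature of the hallucination and the medium
3
Prototype for critical, impactful moments
4
Prototype with minimal learning burden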
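When designing hallucination experiences, designers need to strike a balance between clearly identifying algorithmic errors and hallucination, so as to evoke users' empathy and build a deeper connection with the experience. This balance ensures that users can resonate with the hallucination without being distracted by obvious errors.
Is the hallucination better expressed through factual knowledge or abstract concepts, and does the chosen embodied medium enhance its interpretability and user engagement? Prototypes should match these characteristics, which not only offers insights for future design but also communicates more effectively with the audience.
While some hallucination moments, such as factual errors or out-of-context responses, are key to the experience, others are benign or hard to detect and may be overlooked in prototyping. Prototyping should not try to cover everything; it should concentrate on critical moments to focus users' perception.
When prototyping serves as a means of exploring design implications and imagining futures, designers should use fast, easy-to-understand methods to reduce both the objective burden imposed by the prototyping medium and the subjective burden imposed by a complex or unclear speculative theme.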