Project – Runze Cai

PANDALens: Towards AI-Assisted In-Context Writing on OHMD During Travels

Video Preview

This work will be presented at CHI2024.

Code: GitHub, Paper: PDF.

Abstract:

While effective for recording and sharing experiences, traditional in-context writing tools are relatively passive and unintelligent, serving more like instruments rather than companions. This reduces primary task (e.g., travel) enjoyment and hinders high-quality writing. Through formative study and iterative development, we introduce PANDALens, a Proactive AI Narrative Documentation Assistant built on an Optical See-Through Head Mounted Display that transforms the in-context writing tool into an intelligent companion. PANDALens observes multimodal contextual information from user behaviors and environment to confirm interests and elicit contemplation, and employs Large Language Models to transform such multimodal information into coherent narratives with significantly reduced user effort. A real-world travel scenario comparing PANDALens with a smartphone alternative confirmed its effectiveness in improving writing quality and travel enjoyment while minimizing user effort. Accordingly, we propose design guidelines for AI-assisted in-context writing, highlighting the potential of transforming them from tools to intelligent companions.

ParaGlassMenu: Towards Social-Friendly Subtle Interactions in Conversations

Video Preview

This work was presented at CHI 2023.

Abstract:

Interactions with digital devices during social settings can reduce social engagement and interrupt conversations. To overcome these drawbacks, we designed ParaGlassMenu, a semi-transparent circular menu that can be displayed around a conversation partner’s face on Optical See-Through Head-Mounted Display (OHMD) and interacted subtly using a ring mouse. We evaluated ParaGlassMenu with several alternative approaches (Smartphone, Voice assistant, and Linear OHMD menus) by manipulating Internet-of-Things (IoT) devices in a simulated conversation setting with a digital partner. Results indicated that the ParaGlassMenu offered the best overall performance in balancing social engagement and digital interaction needs in conversations. To validate these findings, we conducted a second study in a realistic conversation scenario involving commodity IoT devices. Results confirmed the utility and social acceptance of the ParaGlassMenu. Based on the results, we discuss implications for designing attention-maintaining subtle interaction techniques on OHMDs.

If you are interested in our project, feel free to access the code on GitHub.

AR²escuer – Towards AR Evacuation Helper in Fire Disaster

Demo Video

We designed AR²escuer, an AR application that helps users evacuate during a fire. AR²escuer can be installed on AR glasses (e.g., NReal) and provides guidance that is more stable and reliable than current physical exit signs. It also offers intuitive, multimodal guidance so that information is delivered clearly and accurately.

This project won the Golden Glasses Award for Best Engineering in the first Summer Bootcamp of Future Interaction for Smart Glasses.

If you are interested in our project, feel free to access the code on GitHub.

Human Pose Estimation And Its Application in HCI

Human Pose Estimation is a method for extracting human key points from a given image or video. We analyze a variety of existing Human Pose Estimation models and select OpenPose to implement behavior recognition, then design specific human-computer interaction applications for smart home scenarios.

Using the Human Pose Estimation model, we first address fall detection for elderly people living alone. Background subtraction is applied to each input image of this scene, which helps improve the accuracy of OpenPose in single-person detection. Rule-based and learning-based methods are each developed to process the body key points obtained from OpenPose and detect falls. The project also sends warning emails automatically to notify family members that the elderly person may have fallen.
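
As a rough illustration of what a rule-based fall check over pose key points can look like, here is a minimal sketch. It is not the project's actual rule set: the key point names (`neck`, `mid_hip`), the `(x, y, confidence)` tuple layout, the thresholds, and the consecutive-frame check are all assumptions made for this example.

```python
import math

# Hypothetical thresholds chosen for illustration, not taken from the project.
MAX_UPRIGHT_ANGLE_DEG = 60.0  # torso tilt from vertical beyond which the body counts as lying
MIN_CONFIDENCE = 0.3          # ignore key points detected with low confidence


def torso_angle_deg(neck, mid_hip):
    """Angle in degrees between the neck-to-hip line and the vertical image axis."""
    dx = mid_hip[0] - neck[0]
    dy = mid_hip[1] - neck[1]
    return math.degrees(math.atan2(abs(dx), abs(dy)))


def looks_fallen(keypoints):
    """keypoints: dict mapping a name to (x, y, confidence) in image pixels."""
    neck = keypoints.get("neck")
    hip = keypoints.get("mid_hip")
    if neck is None or hip is None:
        return False  # not enough information to apply the rule
    if neck[2] < MIN_CONFIDENCE or hip[2] < MIN_CONFIDENCE:
        return False
    return torso_angle_deg(neck, hip) > MAX_UPRIGHT_ANGLE_DEG


def fall_confirmed(frames, min_consecutive=5):
    """True once the rule fires on min_consecutive frames in a row."""
    run = 0
    for keypoints in frames:
        run = run + 1 if looks_fallen(keypoints) else 0
        if run >= min_consecutive:
            return True
    return False


standing = {"neck": (100, 50, 0.9), "mid_hip": (102, 150, 0.9)}
lying = {"neck": (50, 200, 0.9), "mid_hip": (150, 210, 0.9)}
print(looks_fallen(standing), looks_fallen(lying))
```

Requiring several consecutive frames is one simple way to avoid raising an alarm when someone briefly bends over; a warning email would only be triggered once `fall_confirmed` returns true.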

We also address bad posture detection for children watching TV. After background subtraction, a rule-based method processes the body key points obtained from OpenPose, detects bad posture, and issues a notification. This project also provides an API for TV and smart home device manufacturers.
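
A rule-based posture check of this kind might be sketched as follows. The two rules (shoulder tilt and head offset), the key point names, and the thresholds are hypothetical and only illustrate the general approach; they are not the rules used in the project.

```python
import math

# Hypothetical thresholds for illustration only.
MAX_SHOULDER_TILT_DEG = 15.0  # shoulder line angle from horizontal
MAX_HEAD_OFFSET_RATIO = 0.5   # horizontal nose offset relative to shoulder width


def shoulder_tilt_deg(left_shoulder, right_shoulder):
    """Angle in degrees between the shoulder line and the horizontal image axis."""
    dx = left_shoulder[0] - right_shoulder[0]
    dy = left_shoulder[1] - right_shoulder[1]
    return math.degrees(math.atan2(abs(dy), abs(dx)))


def head_offset_ratio(nose, left_shoulder, right_shoulder):
    """Horizontal distance of the nose from the shoulder midpoint, in shoulder widths."""
    width = abs(left_shoulder[0] - right_shoulder[0])
    if width == 0:
        return 0.0  # degenerate detection; treat as no offset
    mid_x = (left_shoulder[0] + right_shoulder[0]) / 2
    return abs(nose[0] - mid_x) / width


def posture_issues(keypoints):
    """keypoints: dict mapping a name to (x, y). Returns the list of rules that fired."""
    needed = ("nose", "left_shoulder", "right_shoulder")
    if any(name not in keypoints for name in needed):
        return []  # cannot evaluate the rules without these points
    nose, left, right = (keypoints[name] for name in needed)
    issues = []
    if shoulder_tilt_deg(left, right) > MAX_SHOULDER_TILT_DEG:
        issues.append("shoulders tilted")
    if head_offset_ratio(nose, left, right) > MAX_HEAD_OFFSET_RATIO:
        issues.append("head off-center")
    return issues


upright = {"nose": (100, 50), "left_shoulder": (130, 100), "right_shoulder": (70, 100)}
slouched = {"nose": (140, 60), "left_shoulder": (130, 120), "right_shoulder": (70, 90)}
print(posture_issues(upright), posture_issues(slouched))
```

Returning the list of fired rules, rather than a single boolean, lets a notification say which aspect of the posture needs correcting.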

The code for both applications is open-sourced on GitHub and can be accessed through this link.