Interruption Handling System for Conversational Robots

Introduction

This repository accompanies the paper "Interruption Handling for Conversational Robots". It contains the source code for an example of our interruption handling system integrated into an LLM-powered social robot built on the Platform for Situated Intelligence (\psi). We prompt engineered a large language model (GPT-4o-2024-05-13) to generate contextualized robot speech and to select fitting facial expressions, head movements, and task actions for a timed decision-making task (a desert survival simulation), a contentious discussion (whether the federal government should abolish capital punishment), and a question-and-answer task (space exploration themed). For more details on the interruption handling system, please refer to our manuscript. For a more detailed description of the robot implementation, see the supplemental materials.
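
As a rough illustration of this prompting pattern only (the study's actual prompts live in SocialRobot/prompts.json and the robot components are built on \psi, not Python), a minimal Python sketch using the official openai package might look like the following; the prompt text and JSON field names are hypothetical.

  # Minimal sketch of prompting GPT-4o for speech plus nonverbal behavior.
  # The prompt text and JSON field names are illustrative assumptions, not the study prompts.
  import json
  from openai import OpenAI

  client = OpenAI()  # reads OPENAI_API_KEY from the environment

  system_prompt = (
      "You are a social robot taking part in a desert survival discussion. "
      "Reply with JSON containing 'speech', 'facial_expression', 'head_movement', and 'task_action'."
  )

  completion = client.chat.completions.create(
      model="gpt-4o-2024-05-13",
      response_format={"type": "json_object"},
      messages=[
          {"role": "system", "content": system_prompt},
          {"role": "user", "content": "I think we should rank the flashlight first."},
      ],
  )

  behavior = json.loads(completion.choices[0].message.content)
  print(behavior["speech"], behavior["facial_expression"])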


Software and Hardware Requirements

Environment:

  • Windows 10 x64

Prerequisites:

  • Python 3
  • Microsoft Platform for Situated Intelligence (\psi)
  • Node.js
  • React

Contents

In overview, the SocialRobot folder contains the files to run the main \psi program. The expressive-face-server folder contains the web app that displays the robot's face. The google-speech-to-text folder contains the Python script that runs the speech recognition and wake word detection service. The head-server folder contains the Arduino code for the robot's head as well as the Python head server that communicates between \psi and the Arduino. The web-app folder contains the files to run the React web app that presents the task information.

SocialRobot

There are three folders, one for each task in the study (practice, discussion, survival), each containing the program to run for that task. Each folder contains the following files: a Google speech-to-text component (GoogleSpeechToTextComponent), a Google text-to-speech component (GoogleTextToSpeechComponent), a component for classifying the type of interruption (LLMInterruptionHandlingComponent), a component for generating robot behavior (LLMResponseComponent), timer helper components (TimerSecondsComponent), and the main program.

Open the SocialRobot solution in Visual Studio to select the program to run. prompts.json contains all the prompts used by the social robot programs in the study.

expressive-face-server

face.css contains the source code for the hand-crafted robot facial expressions; new facial expressions can be crafted by adjusting the positions and timing of its elements. face.html displays the robot face and is what the API calls. face-testing.html displays buttons for selecting a facial expression to show, to help with testing. server.js contains an Express server that accepts HTTP requests to change the robot's facial expression displayed on face.html.
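
As a rough illustration of how another process might drive the face (the port, route, and query parameter below are assumptions for illustration, not taken from server.js), a client could switch expressions with a simple HTTP request:

  # Hedged sketch: ask the Express face server to change the displayed expression.
  # Port, path, and parameter name are assumed; check server.js for the real routes.
  import requests

  FACE_SERVER = "http://localhost:3000"  # assumed port

  def set_expression(name: str) -> None:
      """Request that face.html show the named expression (e.g. 'happy')."""
      response = requests.get(f"{FACE_SERVER}/expression", params={"name": name}, timeout=2)
      response.raise_for_status()

  if __name__ == "__main__":
      set_expression("happy")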

google-speech-to-text

google-speech-to-text-luna-experimental.py contains a program that runs the Google speech-to-text service and detects the wake words ("Luna" and "stop") in the transcribed interim speech.
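
The detection step itself is straightforward to sketch: scan each interim transcript as it arrives and flag any wake word without waiting for the final result. The sketch below assumes the interim transcripts are already available as strings and omits the Google streaming setup.

  # Hedged sketch of wake word spotting on interim transcripts; obtaining the
  # transcripts from Google's streaming speech-to-text API is not shown here.
  import re
  from typing import Optional

  WAKE_WORDS = ("luna", "stop")

  def detect_wake_word(interim_transcript: str) -> Optional[str]:
      """Return the first wake word found in an interim transcript, or None."""
      for word in re.findall(r"[a-z']+", interim_transcript.lower()):
          if word in WAKE_WORDS:
              return word
      return None

  if __name__ == "__main__":
      print(detect_wake_word("Hey Luna, can you hear me?"))  # -> luna
      print(detect_wake_word("Please stop talking"))         # -> stop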

head-server

The head_movement_arduino folder contains head_movement_arduino.ino, a program that controls the three servo motors in the robot. test.py contains a program to help test the connection to the Arduino. testServer.py contains a program that relays head movement commands between the main \psi program and the Arduino.
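
As a rough sketch of that relay pattern (the serial port, baud rate, TCP port, and command format below are assumptions, not values from testServer.py), a minimal head server could accept commands on a local socket and pass them through to the Arduino over serial:

  # Hedged sketch of a head server: receive commands over a local TCP socket
  # and forward them to the Arduino over serial using pyserial.
  # COM port, baud rate, TCP port, and command strings are illustrative assumptions.
  import socket
  import serial  # pip install pyserial

  arduino = serial.Serial("COM4", 9600, timeout=1)  # assumed port and baud rate

  with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
      server.bind(("127.0.0.1", 9000))              # assumed listening port
      server.listen(1)
      conn, _ = server.accept()
      with conn:
          while True:
              data = conn.recv(1024)                # e.g. b"nod\n" sent by the main program
              if not data:
                  break
              arduino.write(data)                   # pass the command through to the .ino sketch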

web-app

The task-interface folder contains the code for the React app that displays the task interface used in the study. An initial screen lets the experimenter choose the order in which the tasks appear; the practice task always appears first. The backend server (server.js) communicates with the \psi program to signal the robot when a task begins and when the user makes changes to the task interface during the survival task.


Usage

Expressive Face

To launch the expressive face component:

  1. Start the expressive face server:
cd expressive-face-server
node server.js
  2. Open face.html in your browser to view the animated robot face.

Task Interface

To launch the task interface:

  1. Start the task server:
cd web-app
node server.js
  2. Run the task interface web application:
cd task-interface
npm start

Questions

For questions not covered in this README, please open an issue or contact the author at [email protected].


BibTeX

If you use our system in a scientific publication, please cite our work:

@inproceedings{cao2025Interruption,
  title={Interruption Handling for Conversational Robots},
  author={Cao, Shiye and Moon, Jiwon and Mahmood, Amama and Antony, Victor Nikhil and Xiao, Ziang and Liu, Anqi and Huang, Chien-Ming},
  year={2025}
}
