Highlights from Human-Robot Interaction (HRI) for Learning Conference
Reported by Suant Jane Sezgin
Using the Furhat robot for education: The importance of expressive faces in social robots
presented by Gabriel Skantze
April 23, 2021
Can robots support language learning?
Vogt et al. (2021) conducted a study with 194 Dutch children learning English words to examine whether robots assist children’s acquisition of L2 target words. The study also asked whether learning with a tablet alone differs from learning with a robot and a tablet. Their findings show that children do learn L2 words when they work with a tutoring robot and a tablet. However, there was no significant difference between the results for children learning from a tutoring robot and a tablet and those for children learning from the tablet alone.
Potential explanations for the limited success of robots in L2 learning were discussed in this seminar. These concerned:
-the lack of a comprehensive theory of why a robot would support language learning
-the limited role of the robot, which comments on the activities on the tablet as a bystander rather than acting as a conversational partner
-understanding the significance of the face for conversation
-limitations in technology
Creating the Furhat robot
Although the technology is still limited, improvements in speech recognition now make it possible to build better systems. Nevertheless, the impact of the face in interaction should not be underestimated. Gaze, for instance, serves different functions in interaction, such as deciding who the next speaker is, communicating joint attention or resolution, and showing intimacy or status.
Since the lack of a face restricts interaction with the robot, a face was later added to the device. To provide a face with richer facial and head movements, the research team fitted the robot with a mask and backprojected an animated figure onto it, drawing on the benefits of previous models such as NAO, Jibo, Sophia and other animated agents. This improvement adds not only to the output of the device but also to the input (Figure 1).
Figure 1. PowerPoint slide about what the face adds to the conversation (Skantze, 2021)
Another advantage of having an animated figure on a robot is the ability to explore different personae, for example by backprojecting a woman, man, child, creature or robot onto the mask (Figure 2).
Figure 2. PowerPoint slide about animated figures on a robot (Skantze, 2021)
Experiments to compare Furhat with other models
-Lip reading comprehension: Comprehension of words under noisy conditions was examined. The comprehension rate was markedly higher when Furhat read the words (72%) than when a robot with a face mask read them (44%). However, comprehension of words read by Furhat was still lower than comprehension of words read by a human (84%).
-Mona Lisa effect on turn-taking: Furhat was compared with an animated agent on a flat screen to see how accurately participants could tell whom the robot was addressing in an interaction. Turn-taking accuracy was much higher with Furhat than with the 2D agent, at 84% and 53% respectively.
Furhat has been one of the most commonly used robots in research, allowing researchers to explore robots in various roles such as an educational partner, a museum presenter, a business interviewer, a host on autonomous buses and a concierge at a train station. It has not been used in research on language learning; however, according to Skantze (2021), it can provide a very rich set-up for language learning. The team therefore designed an experiment around a card game, using it as a testbed for multi-party interaction. In this game, two people are given the task of sorting cards according to some criterion. Figure 3 shows two children trying to sort the buildings on the cards according to their height.
Figure 3. PowerPoint slide about a card game: a testbed for multi-party interaction (Skantze, 2021)
In this game, the robot was designed to collaborate with the players as an equal. It may therefore give a wrong answer as well as a right one. It is up to the players to decide whether to trust the robot’s opinion fully, which resembles human-to-human interaction.
Another improvement in Furhat’s interaction with humans concerns the limitations of speech recognition when interacting with children. Unlike previous models, Furhat does not say “Sorry, I don’t understand.” or “Could you repeat that again?” when it cannot recognize speech. Instead, the device follows the movements on the tablet and comments on the card that was moved last, creating the illusion that the robot understands what is being discussed.
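The fallback behaviour described above can be illustrated with a minimal Python sketch. This is not Furhat's actual implementation or API: the `GameState` class, the `choose_response` function, the example card name and the wording of the utterances are all hypothetical, chosen only to show the idea of commenting on the last moved card instead of asking for a repeat.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GameState:
    """Hypothetical snapshot of the tablet: the card that was moved most recently."""
    last_moved_card: Optional[str] = None


def choose_response(recognized_text: Optional[str], state: GameState) -> str:
    """Pick the robot's next utterance (illustrative wording, not Furhat's)."""
    if recognized_text:
        # Speech was recognized, so the robot can react to what was said.
        return f"You said: {recognized_text}"
    if state.last_moved_card is not None:
        # Recognition failed: comment on the last moved card
        # rather than asking the child to repeat.
        return f"Hmm, what about the {state.last_moved_card}?"
    # No speech and no card movement yet: fall back to a neutral prompt.
    return "Which card should we look at first?"


# Example: recognition failed, but a (hypothetical) card was just moved.
print(choose_response(None, GameState(last_moved_card="Eiffel Tower")))
```

The design point is that the tablet state gives the robot something relevant to say even when the audio channel fails, so the conversation keeps moving.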
Robot’s impact on turn-taking
The research findings showed that the robot regulated turn-taking through gaze. When gaze was shared between the two participants, the dominant speaker answered more often, with a response rate of 67.4% for the dominant speaker and 51.0% for the non-dominant speaker. The gap between dominant and non-dominant speakers was even bigger when the robot’s gaze was on the tablet or when the robot did not produce any questions or comments, with response rates of 55.8% for the dominant speaker and 27.9% for the non-dominant speaker. Moreover, the dominant speakers answered more when the gaze and the questions were directed at them. Interestingly, in the opposite situation, where the robot shifted its gaze to the non-dominant speaker while asking that speaker a question, the response rate for the non-dominant speaker was much higher, at 67.8%. In a nutshell, the results of the study indicate that the robot has a clear role in balancing turn-taking between speakers.
The research also examined the effects of gender and age on participants’ turn-taking. Figure 4 compares the responses for different ages and genders. The figure shows that adults tend to interact for longer than children. More interestingly, although the interaction rate differed little between pairs of different genders at similar ages, the results showed a significant difference when a child interacted with an adult, regardless of gender. Skantze argued that it may be better to let children explore on their own when interacting with a robot rather than pairing a child with an adult.
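To make the size of the gap concrete, the short Python sketch below computes the difference in response rates between dominant and non-dominant speakers for the two conditions reported above. The figures are taken from this summary; reading 55.8% as the dominant speaker's rate and 27.9% as the non-dominant speaker's rate is an assumption based on the word "respectively". The condition labels and the `gap` helper are illustrative names only.

```python
# Response rates (%) from the summary above, as (dominant, non-dominant).
# Assumption: in the second condition, 55.8 is the dominant speaker's rate
# and 27.9 is the non-dominant speaker's rate.
rates = {
    "gaze shared between participants": (67.4, 51.0),
    "gaze on tablet or no question/comment": (55.8, 27.9),
}


def gap(dominant: float, non_dominant: float) -> float:
    """Difference in response rate, in percentage points."""
    return round(dominant - non_dominant, 1)


gaps = {condition: gap(*pair) for condition, pair in rates.items()}

for condition, value in gaps.items():
    print(f"{condition}: {value} percentage points")
```

Under that reading, the gap is 16.4 percentage points with shared gaze and 27.9 percentage points when the robot looks at the tablet or stays silent.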
Figure 4. PowerPoint slide about the effects of age and gender (Skantze, 2021)
The significance of face-to-face interaction should not be underestimated, since it has the potential to support language learning. However, to take advantage of human-robot interaction in language learning, the use of a robot should amount to more than “placing a robot next to a tablet.” For this purpose, it is essential to understand “how to achieve face-to-face interactions in conversational settings.” Technically, we need to find ways to enable robots to comprehend social signals better and to design interactions in which robots can express these signals better. Finally, robots that contribute to (language) learning can be designed only when we have a better pedagogical and psychological understanding of why having a robot is better than working on a tablet.
Skantze, G. (2021, April 23). Using the Furhat robot for education: The importance of expressive faces in social robots [Online conference presentation]. Workshop on HRI for learning, KTH Royal Institute of Technology, Stockholm, Sweden. https://www.kth.se/profile/engwall/page/workshop-on-hri-and-learning-apr...
Vogt, P. et al. (2021, April 23). Studying second language tutoring using social robots: A large-scale study: Some lessons learned [Online conference presentation]. Workshop on HRI for learning, KTH Royal Institute of Technology, Stockholm, Sweden. https://www.kth.se/profile/engwall/page/workshop-on-hri-and-learning-apr...
Images - courtesy of Gabriel Skantze