The formation of subgroups, and with it the problem of intergroup bias, is well studied in psychology. Children can show ingroup preferences from as early as age five. We developed a social robot mediator to explore how a robot could help overcome these intergroup biases, especially for children newly arrived in a country. By utilizing an online evaluation of collaboration levels, we allow the robot to perceive and act upon the current group dynamics. We investigated the effectiveness of the robot's mediating behavior in a between-subject study with 39 children, of whom 13 had arrived in Sweden within the last 2 years. Results indicate that the robot could help the process of inclusion by mediating the activity. The robot succeeds in encouraging the newly arrived children to act in a more outgoing way and in increasing collaboration among ingroup children. Further, children show a higher level of prosociality after interacting with the robot. In line with prior work, this study demonstrates the ability of social robotic technology to assist group processes.

There is a lot to like about this research and this paper submission. The authors tackle an important problem of out-group social behavior, and they do so with an autonomous system. They also work with children, a user population that is difficult to recruit and study. All of this is commendable. Furthermore, the paper is very well written and easy to follow. The descriptions are for the most part complete and precise. Overall I really liked this work. I want to point out in particular the great Introduction. It provides a good motivation and succinctly conveys the most relevant related work and how it motivated this research. There are a few shortcomings that make this paper somewhat incomplete. The main ones are:
1) The authors do not position their work with respect to a specific gap in related work. It is not clear what is missing in the state of the art and thus what the unique contribution of this work is.
2) As a result, the music-based puzzle seems somewhat out of the blue and it is not motivated well. There is some hint in Section III ("We aimed for creating a task..."), but a better description of why this is a good test case for the issues studied is necessary.
3) Most importantly, perhaps, it is never quite clear what the game is that the children play with the robot. What is the puzzle they are solving? What is considered a solution? Without this information it is hard to evaluate the validity of the rest of the findings. What would cause a child to choose one action over another is completely opaque in this description.
4) The hypotheses should be presented more clearly and motivated. The word "more" is not clear: more compared to what? Also, an explanation of where the hypotheses come from is missing.
5) Why did the authors need an additional measure beyond the in-game logs? Why is the dictator game important? Couldn't the authors find the pro-social behaviors within the game?
6) Finally, I found most of the effects quite small and some going in the counter-hypothesized direction. In that light, the authors' discussion, which reads as if the hypotheses were confirmed, seems to overreach. Perhaps a more focused experiment on just the out-group socialization with a baseline of in-group behavior would have been a clearer thing to evaluate.
Some smaller issues include:
1) The radar graphs are not a very clear way to show the effects, as they mix the geometric metaphor with the scale. It took me a long time to parse what to read in these graphs.
2) The description of the autonomy in overly algorithmic and mathematical terms detracts from the readability of the paper. The same behaviors could have been described more concisely and more clearly in prose.
Overall, this is very interesting research, which clearly took great effort and is in the right direction, but it requires some more clarity to be useful as a published paper.

This paper introduces an integrated robotic game and setup. It is a mark of distinction that the experimenters were able to run a user study with actual children from their target population (those in Sweden for less than 2 years). Generally, the research is motivated and presented well. The authors present a solid and up-to-date overview of recent HRI work on robots promoting/influencing group dynamics. The system description and task are presented clearly. The interaction apparatus strikes me as a good example of interactive group child-robot system design: the game is creative, interesting, appears to be engaging, and is accompanied by a clear algorithmic description of robot behavior. It is somewhat odd to present 3 clearly stated hypotheses for the experiment, and then either not conduct or not present results of statistical tests, which are generally considered the appropriate scientific way to determine whether the experimental hypotheses are supported by the data. I understand that recruiting sufficient participants is difficult (particularly when working with a unique target population, which, as noted earlier, is a strength of this work), but in this case, perhaps the standard approach of listing and evaluating hypotheses framed in the form of H1, H2, and H3 is not the right course. That is to say, rather than couching inconclusive results in vague statements that lack mathematical support (e.g., "results show a trend towards the robot being able to achieve a more equal participation during its presence in the first game round", among others), the authors might consider accepting that the data as they are presented do not really answer such questions with much clarity or certainty. Instead, I recommend the authors focus on the many other valuable aspects of the work. What can HRI researchers, in particular those working with groups of children, learn from this work?
The design of the interaction and game seems like a good starting point: what was the process like for developing this system and interaction? The game appears to be successful in engaging groups of children for at least a short while. What changes could be made in future work to encourage clearer demonstrations of in-/out-group co-operation? I am quite certain there are valuable reflections for researchers that can be shared as part of this work; given the small sample size of the experiment, I do not think that the currently presented behavioral hypotheses and measures are the most relevant or interesting parts of this project.
Questions: How much the system relies on Wizard-of-Oz is unclear, yet important to the overall assessment of the system. How often did the Wizard have to intervene?
Minor points:
Abstract: "Results indicate the robot accomplishes to address the inclusive aspect of the group..." is awkwardly worded.
Introduction: "Today's society is increasingly polarized with growing impressions of 'us' vs 'them'....". Such a claim deserves a citation.