MultiMediate: Multi-modal Behaviour Analysis for Artificial Mediation
Details on Evaluation in the Engagement Task

02 June 2026

The test set evaluation phase is approaching! We would like to draw your attention to the following points regarding the Multi-domain Engagement Estimation task.
(1) We have clarified the precise computation rule for the score that is used in the overall ranking. The overall performance of a team will be computed as a weighted average of its performance on the individual test datasets. PInSoRo will receive a weight of 1/3 (as it covers both child-child and child-robot interactions), and each of the four other datasets a weight of 1/6.
(2) Please report extensive ablation experiments on the provided validation sets, as the number of test set evaluations is limited.
(3) In addition to the overall score, please report separate scores for the individual datasets. For PInSoRo, please report separate numbers for child-child and child-robot interactions, as well as for social and task engagement.
(4) For comparable validation set evaluations, please use the evaluation code provided at https://github.com/hcmlab/MultiMediate26.
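
To illustrate the weighting in point (1), the overall score could be computed roughly as follows. This is a minimal sketch, not the official evaluation code. The names dataset_a to dataset_d are placeholders for the four other test datasets, the function name overall_score is hypothetical, and a single PInSoRo score is assumed as input, since this announcement does not specify how the PInSoRo sub-scores are combined.

```python
from fractions import Fraction

# Weights as stated in point (1): PInSoRo 1/3, each of the four other datasets 1/6.
# "dataset_a" ... "dataset_d" are placeholder names, not the real dataset names.
WEIGHTS = {
    "PInSoRo": Fraction(1, 3),
    "dataset_a": Fraction(1, 6),
    "dataset_b": Fraction(1, 6),
    "dataset_c": Fraction(1, 6),
    "dataset_d": Fraction(1, 6),
}

# Sanity check: the weights form a proper weighted average.
assert sum(WEIGHTS.values()) == 1


def overall_score(scores):
    """Return the weighted average of per-dataset scores (hypothetical helper).

    `scores` maps each dataset name in WEIGHTS to a single number,
    e.g. the metric value reported by the official evaluation code.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(float(WEIGHTS[name]) * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Made-up numbers, purely to show the arithmetic.
    example = {
        "PInSoRo": 0.6,
        "dataset_a": 0.3,
        "dataset_b": 0.3,
        "dataset_c": 0.3,
        "dataset_d": 0.3,
    }
    print(round(overall_score(example), 4))
```

For validation results that are comparable across teams, the per-dataset scores themselves should still come from the evaluation code linked in point (4).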