MultiMediate: Multi-modal Behaviour Analysis for Artificial Mediation
Raw Videos and Pre-Computed Features Released!

11 April 2025
news

Video data and pre-computed features are now fully available for this year’s cross-cultural multi-domain engagement estimation task. Compared to last year, we have extended the available feature set with several state-of-the-art visual feature representations. The features include eGeMAPS v2, w2vbert2, XLM-RoBERTa, OpenFace 2.0 (every frame), OpenPose (every frame), CLIP (768 values per frame, every frame), VideoSwinTransformer (768 values per 16 frames, every 16 frames non-overlapping), VideoMAEv2 (1408 values per 16 frames, every 16 frames non-overlapping), and DINOv2+PCA(3) (768x3 values per frame, every 16 frames). If you have already signed the EULA, simply send an email to obtain the download links for the respective datasets. For MPIIGroupInteraction, please write to huajian.qiu@vis.uni-stuttgart.de; for NOXI, write to noxi@hcai.eu; and for NOXI+J, please write to .
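Because some features are provided for every frame (e.g. CLIP) while others are provided once per non-overlapping block of 16 frames (e.g. VideoMAEv2), participants will likely need to align the two rates before fusing them. The following is only a minimal sketch of one way to do this, by repeating each 16-frame feature vector for the frames it covers. The function names, the in-memory list layout, and the handling of trailing frames are assumptions made for illustration; the announcement does not specify the file format or how incomplete final blocks are treated.

```python
from typing import List, Sequence

CHUNK_SIZE = 16  # frames per block, as stated for VideoSwinTransformer and VideoMAEv2


def chunk_index(frame_idx: int, n_chunks: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Return the index of the block whose features should be used for a frame."""
    if frame_idx < 0:
        raise ValueError("frame_idx must be non-negative")
    if n_chunks <= 0:
        raise ValueError("n_chunks must be positive")
    # Assumption: frames beyond the last complete block reuse the last block.
    return min(frame_idx // chunk_size, n_chunks - 1)


def upsample_to_frames(
    chunk_feats: Sequence[Sequence[float]],
    n_frames: int,
    chunk_size: int = CHUNK_SIZE,
) -> List[Sequence[float]]:
    """Repeat block-level feature vectors so that there is one per frame."""
    n_chunks = len(chunk_feats)
    return [chunk_feats[chunk_index(i, n_chunks, chunk_size)] for i in range(n_frames)]


def concat_per_frame(
    frame_feats: Sequence[Sequence[float]],
    chunk_feats: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate per-frame features with upsampled block-level features."""
    upsampled = upsample_to_frames(chunk_feats, len(frame_feats))
    return [list(f) + list(c) for f, c in zip(frame_feats, upsampled)]
```

For instance, 40 frames of 768-dimensional per-frame features combined with two 1408-dimensional block vectors would yield 40 fused vectors of length 2176, with frames 32 to 39 reusing the second block under the assumption above. Averaging the per-frame features within each block instead would be an equally reasonable choice when a lower temporal resolution is acceptable.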