Rethinking Role-Playing Evaluation: Anonymous Benchmarking and a Systematic Study of Personality Effects


This is a companion discussion topic for the original entry at https://arxiv.org/abs/2603.03915