BIThub
RetailBench: Benchmarking long horizon reasoning and coherent decision making of LLM agents in realistic retail environments
system
June 16, 2026, 4:00am
This is a companion discussion topic for the original entry at
https://arxiv.org/abs/2606.15862