FutureSim benchmark evaluates AI agents on continual learning
The newly launched FutureSim benchmark assesses the continual learning capabilities of advanced AI agents. It feeds models such as GPT-5.5 a sequence of news updates and evaluates how effectively they adjust their predictions in response to new information, measuring both how much their forecasts change and how accurate they are.
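To make the two measurements concrete, here is a minimal sketch of how a sequence of forecasts on one yes/no question could be scored. It is not FutureSim's actual scoring code: the Brier score as the accuracy measure, the absolute change between consecutive forecasts as the revision measure, and the names `brier_score` and `evaluate_forecast_sequence` are all assumptions made for illustration.

```python
from typing import Dict, List


def brier_score(prob: float, outcome: int) -> float:
    """Squared error between a probability forecast and a 0/1 outcome (lower is better)."""
    return (prob - outcome) ** 2


def evaluate_forecast_sequence(forecasts: List[float], outcome: int) -> Dict[str, float]:
    """Score one agent's forecasts on a single question.

    `forecasts` holds the probability the agent assigned after each news
    update, in order; `outcome` is 1 if the event happened, else 0.
    Both the metrics and this interface are hypothetical.
    """
    if not forecasts:
        raise ValueError("at least one forecast is required")

    # How far the forecast moved between consecutive updates.
    revisions = [abs(later - earlier) for earlier, later in zip(forecasts, forecasts[1:])]

    return {
        # Accuracy of the last forecast, made with all updates seen.
        "final_brier": brier_score(forecasts[-1], outcome),
        # Average accuracy across the whole sequence of updates.
        "mean_brier": sum(brier_score(p, outcome) for p in forecasts) / len(forecasts),
        # Total movement of the forecast over the sequence.
        "total_revision": sum(revisions),
    }


if __name__ == "__main__":
    # Hypothetical agent that holds at 50% for two updates, then commits to 100%.
    print(evaluate_forecast_sequence([0.5, 0.5, 1.0], outcome=1))
```

Reporting revision and accuracy separately distinguishes an agent that changes its forecast often without improving from one that moves toward the correct answer as evidence arrives.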