Developing and evaluating agents such as robots in the setting where they are meant to be deployed can be costly, dangerous, and time-consuming. A proxy of the target task domain, such as a simulation environment, can therefore be useful.
A recent paper on arXiv.org investigates the usefulness of a proxy domain by providing proxy usefulness (semi)metrics.
Two uses of proxies are distinguished. For proxies used to predict task performance, the paper proposes a metric that quantifies how predictive a proxy is, which lets researchers find the most predictive proxy available. The second use is generating data for learning.
For this case, the researchers introduce a metric for comparing different data-generating domains and selecting the one that yields the best agents. The proposed metrics also allow researchers to tune parameters of their proxy domain for which a ground-truth value from the target domain is not available.
In many situations it is either impossible or impractical to develop and evaluate agents entirely on the target domain on which they will be deployed. This is particularly true in robotics, where doing experiments on hardware is much more arduous than in simulation. This has become arguably more so in the case of learning-based agents. To this end, considerable recent effort has been devoted to developing increasingly realistic and higher fidelity simulators. However, we lack any principled way to evaluate how good a “proxy domain” is, specifically in terms of how useful it is in helping us achieve our end objective of building an agent that performs well in the target domain. In this work, we investigate methods to address this need. We begin by clearly separating two uses of proxy domains that are often conflated: 1) their ability to be a faithful predictor of agent performance and 2) their ability to be a useful tool for learning. In this paper, we attempt to clarify the role of proxy domains and establish new proxy usefulness (PU) metrics to compare the usefulness of different proxy domains. We propose the relative predictive PU to assess the predictive ability of a proxy domain and the learning PU to quantify the usefulness of a proxy as a tool to generate learning data. Furthermore, we argue that the value of a proxy is conditioned on the task that it is being used to help solve. We demonstrate how these new metrics can be used to optimize parameters of the proxy domain for which obtaining ground truth via system identification is not trivial.
Research paper: Courchesne, A., Censi, A., and Paull, L., “On Assessing the Usefulness of Proxy Domains for Developing and Evaluating Embodied Agents”, 2021. Link: https://arxiv.org/abs/2109.14516