AI-Assisted Scientific Data Collection with Iterative Human Feedback
Abstract
Although artificial intelligence has revolutionized data analysis, significantly less work has focused on using AI to improve scientific data collection. Past work on AI for data collection has typically assumed that humans define the objective function before an experiment starts; however, this is a poor fit for scientific domains, where new discoveries and insights are made while data is being collected. In this paper we present a new framework that allows AI systems to work together with humans (e.g., scientists) to collect data more effectively in simple scientific domains. We present a novel algorithm, TESA, which seeks to achieve good performance by learning from past human behavior how to direct data collection to places that are likely to become scientifically interesting in the future. We analyze the problem theoretically, defining a novel notion of regret in this setting and showing that TESA is zero regret. Next, we show that TESA outperforms other related algorithms in simulations using real data drawn from three diverse domains (economics, mental health, and cognitive psychology). Finally, we run experiments with human subjects across these scientific domains to compare our iterative human-in-the-loop process to a (more standard) workflow in which information is communicated to the AI a priori.
Authors
Travis Mandel
James Boyd
Sebastian J. Carter
Randall H. Tanaka
Taishi Nammoto
Resources
AI-Assisted Scientific Data Collection with Iterative Human Feedback
Travis Mandel, James Boyd, Sebastian J. Carter, Randall H. Tanaka, Taishi Nammoto
AAAI Conference on Artificial Intelligence (AAAI 2021)
[Main text (510 KB PDF)]
[Appendix (653 KB PDF)]
[Simulation Codebase (git repository)]
Errata
For the cognitive psychology environment, the samples returned were always equal to the (true) means derived from the real-world cognitive psychology dataset. The other two environments behave as described in the paper.