Challenges in Non-Experimental Causal Inference with Data from Socio-Technical Systems
by: Rick Wash
Abstract
Many research questions can be answered by analyzing the decisions that humans make while using socio-technical systems. Building on Pearl’s (2009) theory of causal graphs, I identify three challenges that arise when using non-experimental data from socio-technical systems to answer questions about the causal effects of human decisions. First, technical features and affordances can create an endogenous selection bias that threatens the validity of causal inference even when results are properly scoped to ‘the users of the system’. Second, I highlight the problem of proxy control that arises when only log data are used to make claims about humans. Third, I re-emphasize the problem of homophily bias that arises when analyzing social network data, and argue that this bias can influence a wide variety of questions beyond homophily.
Reference
Rick Wash. “Challenges in Non-Experimental Causal Inference with Data from Socio-Technical Systems.” Working paper. October 2015.