Course on Dynamic A/B Testing

UofT Grad Course: Designing Intelligent Self-Improving Systems Through Human Computation, Randomized A/B Experiments and Statistical Machine Learning

CSC2558: Topics in Multidisciplinary HCI: Designing Intelligent Self-Improving Systems Through Human Computation, Randomized A/B Experiments and Statistical Machine Learning

Course enrollment, Auditing, and other details

Anyone is welcome to attend the first few classes. Auditors are welcome, especially upper-year students who want to work on projects on dynamic experimentation. Computer science students can register immediately. Students in other departments are welcome to take the course, and will likely get a spot, but ACORN will only allow enrollment after a certain date.  

Access to Materials

To get access to the course materials (slides, readings, recorded lectures) as they become available, request to join the Google Group "Info For Grad Course: Designing Intelligent Self-Improving Systems". The course will be provided as a synchronous (and asynchronous) online course, and a wide range of materials have been put together to allow self-paced study.

Course Goals

This course mixes a seminar-style format with a project-based one, with a heavy emphasis on students advancing their own research through projects that involve collaboration across disciplines.

An ideal project involves the design of randomized experiments in technology, integrating three perspectives:

(1) The design of randomized experiments, led by a behavioral/social science grad student (psychology, public health, medicine, education).

(2) The deployment and evaluation of software and apps, led by a computer science grad student (human-computer interaction).

(3) The application and extension of algorithms/models for analyzing data and conducting dynamic experiments, led by a grad student with a background in statistics/machine learning (statistics, computational psychology, machine learning, economics, operations research).

Students are not required to have a background in all three of these areas; they will learn enough to collaborate with students who specialize in each area.

Class discussion topic: The importance of stupidity in scientific research

Bayesian inference for Thompson sampling (posterior sampling)

Course Description • Computer Science and Statistics

The course will give students an introduction to research on how to design software systems that can be deployed to real users and use data to automatically improve (see http://www.josephjaywilliams.com/research-overview). For example, building lessons that continually improve and personalise which explanations are provided to students, or building apps that motivate people to change behaviour by crowdsourcing motivational messages and using machine learning to experiment with which messages change people's decisions.

Designing these systems draws on human-computer interaction research on crowdsourcing and human computation to generate new system actions, theories from cognitive science/psychology/public health to design ways to measure what helps users, and algorithms from statistical machine learning & artificial intelligence to conduct experiments and analyse data in real time.

This is a seminar-style course, with a heavy emphasis on students doing collaborative research projects, involving the design of randomized experiments by behavioral/social scientists (psychology, public health, education), the deployment and evaluation of software and apps (human-computer interaction), and the application and extension of algorithms/models for dynamic experimentation (statistics, machine learning, operations research).

This course introduces students to computational principles for designing user-facing systems that are intelligent and continually improving, drawing on interdisciplinary work in crowdsourcing, human computation, psychology, experimental design, and statistical machine learning. Students will learn principles for enhancing and personalizing adaptive user-facing systems through randomized A/B experimental comparisons. (Examples of systems include lessons in online courses, activities for mental health, apps for encouraging exercise and other health behavior change, marketing and product design). Students will learn how to design and conduct randomized A/B experiments that are collaborative, dynamic, and personalized. 

Collaborative experiments can require combining multiple stakeholders in design: theories from the social and behavioural sciences used to design alternative versions of a user-facing system (e.g. educational theories about learning, the psychology of goal-setting, and clinical and public health insights into cognitive behaviour therapy); the practical experience of designers; and crowdsourcing techniques that let users themselves participate in designing improvements to be experimentally evaluated.

Dynamic and personalized experiments require using statistics, machine learning, or other artificial intelligence techniques to discover in real time which conditions are effective (on average, and for subgroups of users). These models and algorithms may include reinforcement learning, multi-armed (contextual) bandits, and adaptive clinical trials. The course will combine multiple disciplines and involve collaborative work (e.g. teams of students with backgrounds in social and behavioural science, human-computer interaction and crowdsourcing, and statistics or machine learning).
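To make the idea of a dynamic experiment concrete, the sketch below simulates Thompson sampling for a two-condition A/B experiment with binary outcomes, using Beta posteriors over each condition's success rate. The condition success rates and user count are made-up illustration values; this is a minimal sketch of the general technique, not the implementation used in any particular system from the course.

```python
import random

def thompson_step(successes, failures):
    """Draw one sample from each condition's Beta posterior (uniform
    Beta(1, 1) prior) and assign the next user to the highest draw."""
    samples = [random.betavariate(s + 1, f + 1)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=lambda i: samples[i])

def simulate(true_rates, n_users, seed=0):
    """Run a simulated dynamic experiment: each arriving user is assigned
    by Thompson sampling, and the chosen condition's counts are updated."""
    random.seed(seed)
    k = len(true_rates)
    successes, failures = [0] * k, [0] * k
    for _ in range(n_users):
        arm = thompson_step(successes, failures)
        if random.random() < true_rates[arm]:  # simulated user outcome
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures

# Hypothetical conditions with 10% vs 15% true success rates.
successes, failures = simulate([0.10, 0.15], n_users=2000)
```

As data accumulates, the posterior for the better condition concentrates on higher values, so its samples win more often and it receives a growing share of users, which is exactly the "discover in real time which conditions are effective" behaviour described above.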

Course Description • Social, Behavioural and Health Sciences

This course will introduce students to the technology skills needed to design and deploy real-world field experiments, through taking part in projects that will let them design experiments relevant to their research. Examples of experiments could include: Investigating how people learn in online courses, conducting growth mindset interventions in on-campus courses, and encouraging health behavior change through persuasive mobile apps.

Social, behavioural and health science students will also learn about statistics and machine learning techniques for dynamically analyzing data from experiments, in order to improve outcomes for future participants. They will learn to apply algorithms that analyze data to present more beneficial conditions more frequently to future participants (e.g. transition from 50/50 randomization to weighted randomization of 60/40, 80/20…). In addition, modelling of participant-treatment interactions (e.g. condition A is more effective for people with attitude X) can lead to the personalization of technology (giving different conditions to different subgroups).
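The transition from 50/50 to weighted randomization described above can be sketched as follows: with binary outcomes and Beta posteriors, each condition's randomization weight is set to the estimated posterior probability that it is the best condition (computed by Monte Carlo sampling). The pilot counts below are made-up numbers for illustration, and this is one common scheme (the one underlying Thompson sampling), not a prescribed analysis for the course.

```python
import random

def allocation_weights(successes, failures, n_draws=10_000, seed=1):
    """Estimate P(condition i is best) under independent Beta posteriors
    (uniform priors) and return those probabilities as randomization weights."""
    random.seed(seed)
    k = len(successes)
    wins = [0] * k
    for _ in range(n_draws):
        draws = [random.betavariate(successes[i] + 1, failures[i] + 1)
                 for i in range(k)]
        wins[draws.index(max(draws))] += 1  # which condition looked best
    return [w / n_draws for w in wins]

# Hypothetical 50/50 pilot: condition A succeeded 10/50 times, B 18/50.
# The weights shift traffic toward B while still assigning some users to A.
weights = allocation_weights(successes=[10, 18], failures=[40, 32])
```

A common practical refinement is to cap the weights (e.g. never let a condition fall below some floor such as 10%) so that enough data keeps arriving for every condition to support later statistical comparisons.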

Social, behavioural and health science students are not required to have any programming experience, as they will have the opportunity to collaborate with computer science students in building web software. They will also be able to learn about statistics and machine learning techniques by collaborating with grad students who specialize in these areas. In return, social, behavioural and health science students can offer computer science students insights into how social and behavioural theories can enhance technology, and how to design rigorous and principled experiments.

The course will be seminar-style and project-based; an ideal outcome is that a student gains input on, and learns how to apply, computer science techniques for a set of experiments or research questions central to their work.

Examples of relevant papers to discuss

Kizilcec, R. F., Saltarelli, A. J., Reich, J., & Cohen, G. L. (2017). Closing global achievement gaps in MOOCs. Science, 355(6322), 251-252. [PDF]

Meurer, W. J., Lewis, R. J., Tagle, D., Fetters, M. D., Legocki, L., Berry, S., … Barsan, W. G. (2012). An Overview of the Adaptive Designs Accelerating Promising Trials Into Treatments (ADAPT-IT) Project. Annals of Emergency Medicine, 60(4), 451–457. [PDF]

Scott, S. L. (2010). A modern Bayesian look at the multi-armed bandit. Applied Stochastic Models in Business and Industry, 26(6), 639-658. [PDF]

Williams, J. J., Rafferty, A., Tingley, D., Ang, A., Lasecki, W. S., & Kim, J. (2018). Enhancing Online Problems Through Instructor-Centered Tools for Randomized Experiments. In CHI 2018, 36th Annual ACM Conference on Human Factors in Computing Systems. [PDF] [Talk Slides] [Video Figure] [Video of Talk]

Williams, J. J., Lombrozo, T., Hsu, A., Huber, B., & Kim, J. (2016). Revising Learner Misconceptions Without Feedback: Prompting for Reflection on Anomalous Facts. Proceedings of CHI 2016, 34th Annual ACM Conference on Human Factors in Computing Systems. [PDF] [Slides] [Video of Talk]

Williams, J. J., Kim, J., Rafferty, A., Maldonado, S., Gajos, K., Lasecki, W., & Heffernan, N. (2016). AXIS: Generating Explanations at Scale with Learnersourcing and Machine Learning. Proceedings of the Third Annual ACM Conference on Learning at Scale.  [PDF] [Slides]

Lomas, J. D., Forlizzi, J., Poonwala, N., Patel, N., Shodhan, S., Patel, K., ... & Brunskill, E. (2016, May). Interface Design Optimization as a Multi-Armed Bandit Problem. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (pp. 4142-4153). ACM.

Expanded (rougher) list of papers

Examples of weekly topics are: 

Recommended (not required) preparation is one of the following courses (few students will have taken more than one): human-computer interaction; a psychology/economics/public health/business course on theories of behaviour, learning, or decision making; experimental design; machine learning; or introductory statistics.