# Prospective Students

We are accepting applications for PhD (and Master's) programs for Fall 2024 entry. Apply to U of T programs in Computer Science, Psychology, or Statistics.

## Programs

Students can apply through Computer Science, Psychology, or Statistics (where I have cross-appointments and can supervise students). Our lab is highly interdisciplinary and currently includes students from HCI, Machine Learning, Psychology, Mental Health, Education, Economics, and Statistics. No one student is expected to have all these skills; lab members teach each other!

## Research Directions

As a graduate student, you would set the agenda for research questions in collaboration with me, but illustrative examples of potential research directions are:

Developing new systems for crowdsourcing the design of online problems and lessons, using multi-stage workflows that incorporate input from students, crowd workers, instructors, and learning scientists.

Creating and evaluating tools that enable collaboration between instructors and researchers, such as co-design of interventions and personalized lessons, and coordinated analysis of data about learning outcomes for students with different characteristics.

Investigating why and when prompting students to explain text/video lectures promotes learning, and understanding the effect of multi-modal interfaces that incorporate writing, speaking, and video creation. Teaching metacognitive skills and self-regulated learning of study behaviours, taking a user-centred approach to designing social-psychological interventions for enhancing motivation such as Growth Mindset and Wise Feedback.

Enhancing student wellness and mental health by testing interventions that encourage people to exercise, monitor stress, and apply principles from Cognitive Behaviour Therapy to managing emotions. Investigating how to support online peer-to-peer discussions around issues like managing anxiety or developing socio-emotional skills.

Interpretable and Interactive Machine Learning Systems for dynamically enhancing and personalizing instruction, especially from the perspective of combining human computation with techniques from multi-armed bandits/reinforcement learning, Bayesian optimization, and applications of deep learning to natural language processing.

Based on the alignment of interests and time, students may have opportunities to collaborate with people in U of T's Computer Science Education research group (e.g. Andrew Petersen), the Vector Institute/Artificial Intelligence/Machine Learning group (e.g. Amir Massoud Farahmand, Marzyeh Ghassemi), HCI people at DGP (e.g. Tovi Grossman, Fanny Chevalier), Psychology Department (e.g. Cendri Hutcherson, Mickey Inzlicht), the Education School OISE, and many other areas like Computational Social Science (e.g. Ashton Anderson).

## How to Apply

If you're interested, please apply to the University of Toronto Ph.D. program in Computer Science, Psychology, or Statistics, and list me as a potential advisor. Note that the Master's is a research program available to Canadian Citizens or Permanent Residents (and the MScAC is a professional master's without a research component). Due to the volume of applications, I can't promise to reply but I do make sure to look at every application that gets submitted.

To find out more about what I do, you can read my Research Statement, read one or two relevant Papers, or look at these four talks I've given to HCI, Psychology, Machine Learning, and Statistics: Talks Illustrating Examples of Lab Research. You can choose whichever is most relevant to the area you want to work in.

You can also send an email to iaiinterest@googlegroups.com with information about yourself, what relevant research experience you have, what parts of my website you've looked at and what you found interesting about them, what topics you're interested in and why, and why you want to pursue a PhD program.

## Lab Culture

If you want to learn more about the lab culture, you can get a sense by looking at the photo carousel, or the videos which some students kindly recorded for Joseph on his birthday!

There is a full 9-minute version, as well as shorter prototype A and shorter prototype B that some students created.

## Undergraduate Students

The IAI group recruits undergraduate students at the University of Toronto every semester. You can apply through the Computer Science ROP program, Work-Study Program, or email us directly if you're interested in volunteering.

Regardless of the application channel you use, you should also:

Read my Research Statement

Read one or two relevant Papers

Look at one of the four talks I've given to HCI, Psychology, Machine Learning, and Statistics: Talks Illustrating Examples of Lab Research.

Send an email to iaiinterest@googlegroups.com with information about yourself, any relevant research experience you have, what topics interest you, which of the resources listed above you've looked at, and what you found interesting about them.

## Talks Illustrating Examples of Lab Research

### HCI (Human-Computer Interaction) targeted talk (UWashington DUB/HCI series)

Slides: tiny.cc/iaislides Recording: tiny.cc/iairecording

Short Title: Enhancing and Personalizing Technology Through Dynamic Experimentation

Long Title: Perpetually Enhancing and Personalizing Technology for Learning & Health Behavior Change: Using Randomized A/B Experiments to integrate Human-Computer Interaction, Psychology, Crowdsourcing & Statistical Machine Learning

How can we transform the everyday technology people use into intelligent, self-improving systems? Our group applies statistical machine learning algorithms to analyze randomized A/B experiments and give the most effective conditions to future users. Ongoing work includes comparing different explanations for concepts in digital lessons/problems, getting people to exercise by testing motivational text messages, and discovering how to personalize micro-interventions to reduce stress and improve mental health. One example system crowdsourced explanations for how to solve math problems from students and teachers, and conducted an A/B experiment to identify which explanations other students rated as being helpful. We used algorithms for multi-armed bandits that analyze data in order to estimate the probability that each explanation is the best, and adaptively weight randomization to present better explanations to future learners (LAS 2016, CHI 2018). This generated explanations that helped learning as much as those of a real instructor. Ongoing work aims to personalize, by discovering which conditions are effective for subgroups of users. We use randomized A/B experiments in technology as an engine for practical improvement, in tandem with advancing research in HCI, psychological theory, statistics, and machine learning.
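As a rough illustration of "estimating the probability that each explanation is the best", the sketch below assumes binary helpful/not-helpful ratings, independent Beta(1, 1) priors, and a Monte Carlo estimate. The function name and the rating counts are hypothetical, and this is not the lab's implementation.

```python
import random

def prob_each_arm_is_best(successes, failures, n_draws=5000, seed=0):
    """Monte Carlo estimate of P(arm is best) under Beta(1 + s, 1 + f) posteriors."""
    rng = random.Random(seed)
    wins = [0] * len(successes)
    for _ in range(n_draws):
        # Draw one plausible "helpfulness rate" per explanation from its posterior.
        draws = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
        wins[draws.index(max(draws))] += 1
    return [w / n_draws for w in wins]

# Hypothetical counts of helpful / not-helpful ratings for three explanations.
helpful = [30, 12, 20]
not_helpful = [10, 28, 20]

# These estimates can serve as randomization weights for the next learner.
weights = prob_each_arm_is_best(helpful, not_helpful)
```

Assigning the next learner to an explanation with probability equal to its weight shifts traffic toward explanations that look better, while still occasionally sampling the others.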

### Machine Learning targeted talk (Vector, MILA-McGill/Stanford, Microsoft Research New York)

Vector Institute for Artificial Intelligence Jan 2021:

Recording: 2021-01-22 WILLIAMS Joseph Jay Edited.mp4

Short Title: Adapting Real-World Experimentation To Balance Enhancement of User Experiences with Statistically Robust Scientific Discovery

Long Title: Perpetually Enhancing User Interfaces in Tandem with Advancing Scientific Research in Education & Mental Health: Enabling Reliable Statistical Analysis of the Data Collected by Algorithms that Trade Off Exploration & Exploitation

How can we transform the everyday technology people use into intelligent, self-improving systems? For example, how can we perpetually enhance text messages for managing stress, or personalize explanations in online courses? Our work explores the use of randomized adaptive experiments that test alternative actions (e.g. text messages, explanations), aiming to gain greater statistical confidence about the value of actions, in tandem with rapidly using this data to give better actions to future users.

To help characterize the problems that arise in statistical analysis of data collected while trading off exploration and exploitation, we present a real-world case study of applying the multi-armed bandit algorithm TS (Thompson Sampling) to adaptive experiments. TS aims to assign people to actions in proportion to the probability those actions are optimal. We present empirical results on how the reliability of statistical analysis is impacted by Thompson Sampling, compared to a traditional experiment using uniform random assignment. This helps characterize a substantial problem to be solved – using a reward maximizing algorithm can cause substantial issues in statistical analysis of the data. More precisely, an adaptive algorithm can increase both false positives (believing actions have different effects when they do not) and false negatives (failing to detect differences between actions). We show how statistical analyses can be modified to take into account properties of the algorithm, but that these do not fully address the problem raised.

We therefore introduce an algorithm that assigns a proportion of participants uniformly randomly and the remaining participants via Thompson sampling. The probability that a participant is assigned using Uniform Random (UR) allocation is set to the posterior probability that the difference between two arms is 'small' (below a certain threshold), allowing for more UR exploration when there is little or no reward to be gained by exploiting. The resulting data can enable more accurate statistical inferences from hypothesis testing by detecting small effects when they exist (reducing false negatives), and reducing false positives.
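The allocation rule described above can be sketched for two arms as follows. This is a minimal illustration under stated assumptions, not the published implementation: rewards are binary, each arm has a Beta(1, 1) prior, the posterior probability of a "small" difference is estimated by Monte Carlo, and all function names and default values are hypothetical.

```python
import random

def posterior_prob_small_diff(s, f, threshold, rng, n_draws=500):
    """Estimate P(|p0 - p1| < threshold) under Beta(1 + s, 1 + f) posteriors."""
    small = 0
    for _ in range(n_draws):
        p0 = rng.betavariate(1 + s[0], 1 + f[0])
        p1 = rng.betavariate(1 + s[1], 1 + f[1])
        if abs(p0 - p1) < threshold:
            small += 1
    return small / n_draws

def assign_arm(s, f, threshold, rng):
    """Use Uniform Random (UR) with probability P(small difference), else Thompson Sampling (TS)."""
    p_ur = posterior_prob_small_diff(s, f, threshold, rng)
    if rng.random() < p_ur:
        return rng.randrange(2), "UR"
    # Thompson Sampling step: draw once from each posterior and pick the larger draw.
    draws = [rng.betavariate(1 + s[i], 1 + f[i]) for i in range(2)]
    return draws.index(max(draws)), "TS"

def run_experiment(true_rates, n_participants, threshold=0.1, seed=0):
    """Simulate the mixed UR/TS allocation with hypothetical true success rates."""
    rng = random.Random(seed)
    s, f = [0, 0], [0, 0]  # successes and failures observed per arm
    n_ur = 0
    for _ in range(n_participants):
        arm, how = assign_arm(s, f, threshold, rng)
        n_ur += how == "UR"
        if rng.random() < true_rates[arm]:
            s[arm] += 1
        else:
            f[arm] += 1
    return s, f, n_ur

successes, failures, n_uniform = run_experiment([0.5, 0.5], n_participants=200)
```

When the two arms truly have similar reward rates, the posterior probability of a small difference tends to grow, so more participants are assigned uniformly at random; when one arm is clearly better, that probability shrinks and Thompson Sampling dominates.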

The work we present aims to surface the underappreciated complexity of using adaptive experimentation to both enable scientific/statistical discovery and help real-world users. The current work takes a first step towards computationally characterizing some of the problems that arise, and what potential solutions might look like, in order to inform and invite multidisciplinary collaboration between researchers in machine learning, statistics, and the social-behavioral sciences.

### Psychology targeted talk (Social-Personality & Cognitive Psych at U of T)

U of T Social Psychology Research Group talk Sep 2019:

Slides: SPRG Talk v5.pptx and Recording: SPRG Talk (rough, unedited).mp4.

Title: Conducting Adaptive Field Experiments that Enhance and Personalize Education and Health Technology

Understanding people’s complex real-world thinking is a challenge for psychology, while human-computer interaction aims to build computational systems that can behave intelligently in the real world. This talk presents a framework for redesigning the everyday websites people interact with to function as: (1) micro-laboratories for psychological experimentation and data collection, and (2) intelligent adaptive agents that implement machine learning algorithms to dynamically discover how to optimize and personalize people’s learning and reasoning. I present an example of how this framework is used to embed randomized experiments into real-world online educational contexts, like learning to solve math problems, and to use machine learning for automated experimentation. Explanations (and experimental conditions) are crowdsourced from teachers and scientists, and reinforcement learning algorithms for multi-armed bandits are used in real time to discover the best explanations for optimizing learning. Ongoing research examines tools for managing stress by applying approaches like cognitive behavior therapy, helping university students self-regulate and plan, and encouraging health behavior change through a collaboration with Goodlife to get people to go to the gym.

### Statistics targeted talk (Cambridge, UMichigan, Columbia)

Title: Adapting Real-World Experimentation To Balance Enhancement of User Experiences with Statistically Robust Scientific Discovery

Cambridge University May 2021

Slides: tiny.cc/jstatstalk

Recording: https://www.youtube.com/watch?v=vV1ShFophpM

How can we transform the everyday technology people use into intelligent, self-improving systems? For example, how can we perpetually enhance text messages for managing stress, or personalize explanations in online courses? Our work explores the use of randomized adaptive experiments that test alternative actions (e.g. text messages, explanations), aiming to gain greater statistical confidence about the value of actions, in tandem with rapidly using this data to give better actions to future users.

To help characterize the problems that arise in statistical analysis of data collected while trading off exploration and exploitation, we present a real-world case study of applying the multi-armed bandit algorithm TS (Thompson Sampling) to adaptive experiments, providing more empirical context to issues raised by past work on adaptive clinical trials. TS aims to assign people to actions in proportion to the probability those actions are optimal. The empirical results help characterize how a reward maximizing algorithm can increase both false positives (believing actions have different effects when they do not) and false negatives (failing to detect differences between actions). We explore two methods that take into account properties of the TS algorithm used to collect data: inverse-probability weighting and an 'algorithm-induced hypothesis test' that uses non-parametric simulations under the null. These help but do not fully address the problems raised.
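To make inverse-probability weighting concrete, here is a minimal sketch of a self-normalized (Hájek-style) weighted mean for one arm. It assumes the probability with which each participant received their arm was logged at assignment time; under Thompson Sampling that probability has no simple closed form and would itself typically be estimated by simulation. The function name and the toy data are hypothetical, and the talk's exact estimator may differ.

```python
def ipw_arm_mean(assigned_arms, rewards, assign_probs, arm):
    """Self-normalized inverse-probability-weighted mean reward for one arm.

    assign_probs[t] is the probability with which participant t was given the
    arm they actually received (assumed to be logged by the algorithm).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for a, y, p in zip(assigned_arms, rewards, assign_probs):
        if a == arm:
            weighted_sum += y / p   # up-weight observations that were unlikely
            weight_total += 1.0 / p
    return weighted_sum / weight_total

# Toy data: four participants, two arms, binary rewards.
arms = [0, 0, 1, 0]
rewards = [1, 0, 1, 1]
probs = [0.5, 0.8, 0.2, 0.25]

ipw_estimate = ipw_arm_mean(arms, rewards, probs, arm=0)
naive_estimate = sum(y for a, y in zip(arms, rewards) if a == 0) / arms.count(0)
```

Observations collected when an arm was unlikely to be chosen receive larger weights, which counteracts the way an adaptive algorithm over-samples arms that happened to look good early on.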

We therefore introduce an algorithm which assigns a proportion of participants uniformly randomly and the remaining participants via Thompson sampling. The probability that a participant is assigned using Uniform Random (UR) allocation is set to the posterior probability that the difference between two arms is 'small' (below a certain threshold), allowing for more UR exploration when there is little or no reward to be gained by exploiting. The resulting data can enable more accurate statistical inferences from hypothesis testing by detecting small effects when they exist (reducing false negatives), and reducing false positives.

The work we present aims to surface the underappreciated complexity of using adaptive experimentation to both enable scientific/statistical discovery and help real-world users. We conduct field deployments and provide software the community can use to evaluate statistical tests and algorithms in complex real-world applications. This helps provide first steps towards integrating two key approaches: (1) How can we modify statistical tests to better match the properties of the algorithms that collect data in adaptive experiments? (2) How can we modify algorithms for adaptive experimentation to be more "statistically considerate" in being better suited to inference and analysis of data, while maximizing chances of giving participants useful arms? Tackling these questions requires multidisciplinary collaboration between researchers in machine learning, statistics, and the social-behavioral sciences. This is joint work with Nina Deliu, Sofia Villar, Audrey Durand, Anna Rafferty, and others.