Representative Talks

HCI (Human-Computer Interaction) Targeted Talk

Slides: tiny.cc/iaislides 

Recording: tiny.cc/iairecording

Short Title: Enhancing and Personalizing Technology Through Dynamic Experimentation

Long Title: Perpetually Enhancing and Personalizing Technology for Learning & Health Behavior Change: Using Randomized A/B Experiments to Integrate Human-Computer Interaction, Psychology, Crowdsourcing & Statistical Machine Learning

How can we transform the everyday technology people use into intelligent, self-improving systems? Our group applies statistical machine learning algorithms to analyze randomized A/B experiments and give the most effective conditions to future users. Ongoing work includes comparing different explanations for concepts in digital lessons/problems, getting people to exercise by testing motivational text messages, and discovering how to personalize micro-interventions to reduce stress and improve mental health. One example system crowdsourced explanations for how to solve math problems from students and teachers, and conducted an A/B experiment to identify which explanations other students rated as being helpful. We used algorithms for multi-armed bandits that analyze data in order to estimate the probability that each explanation is the best, and adaptively weight randomization to present better explanations to future learners (LAS 2016, CHI 2018). This generated explanations that helped learning as much as those of a real instructor. Ongoing work aims to personalize, by discovering which conditions are effective for subgroups of users. We use randomized A/B experiments in technology as an engine for practical improvement, in tandem with advancing research in HCI, psychological theory, statistics, and machine learning.
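As a rough illustration of the bandit step described above, the following is a minimal sketch, not the system's actual implementation. It assumes each explanation receives a binary "helpful / not helpful" rating and uses independent Beta(1, 1) priors. The function names and example counts are hypothetical.

```python
import random


def prob_each_arm_is_best(successes, failures, n_draws=5000, rng=None):
    """Monte Carlo estimate of P(arm is best) under Beta(1 + s, 1 + f) posteriors."""
    rng = rng or random.Random(0)
    wins = [0] * len(successes)
    for _ in range(n_draws):
        # One joint draw from the posterior over every arm's success rate.
        draws = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
        wins[draws.index(max(draws))] += 1
    return [w / n_draws for w in wins]


def thompson_choose(successes, failures, rng):
    """Pick the explanation for the next learner: sample once, take the argmax."""
    draws = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
    return draws.index(max(draws))


if __name__ == "__main__":
    # Hypothetical counts of helpful / unhelpful ratings for three explanations.
    helpful, unhelpful = [30, 12, 5], [10, 18, 25]
    print(prob_each_arm_is_best(helpful, unhelpful))
    print(thompson_choose(helpful, unhelpful, random.Random(1)))
```

Sampling once and taking the argmax assigns each explanation with probability equal to its posterior chance of being best, which is what "adaptively weight randomization" amounts to in this sketch.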


Data Science Targeted Talk

Title: Adapting Real-World Experimentation To Balance Enhancement of User Experiences with Statistically Robust Scientific Discovery

Abstract: How can we transform the everyday technology people use into intelligent, self-improving systems? For example, how can we perpetually enhance text messages for managing stress, or personalize explanations in online courses? Our work explores the use of randomized adaptive experiments that test alternative actions (e.g. text messages, explanations), aiming to gain greater statistical confidence about the value of actions, in tandem with rapidly using this data to give better actions to future users.

Machine Learning Targeted Talk

Slides: Adapting Real-World Experimentation To Balance Enhancement of User Experiences with Statistically Robust Scientific Discovery.pptx

Recording: 2021-01-22 Adapting Real-World Experimentation.mp4

Short Title: Adapting Real-World Experimentation To Balance Enhancement of User Experiences with Statistically Robust Scientific Discovery

Long Title: Perpetually Enhancing User Interfaces in Tandem with Advancing Scientific Research in Education & Mental Health: Enabling Reliable Statistical Analysis of the Data Collected by Algorithms that Trade Off Exploration & Exploitation

How can we transform the everyday technology people use into intelligent, self-improving systems? For example, how can we perpetually enhance text messages for managing stress, or personalize explanations in online courses? Our work explores the use of randomized adaptive experiments that test alternative actions (e.g. text messages, explanations), aiming to gain greater statistical confidence about the value of actions, in tandem with rapidly using this data to give better actions to future users. 

To help characterize the problems that arise in statistical analysis of data collected while trading off exploration and exploitation, we present a real-world case study of applying the multi-armed bandit algorithm TS (Thompson Sampling) to adaptive experiments. TS aims to assign people to actions in proportion to the probability those actions are optimal. We present empirical results on how the reliability of statistical analysis is impacted by Thompson Sampling, compared to a traditional experiment using uniform random assignment. This helps characterize a substantial problem to be solved: using a reward-maximizing algorithm can undermine statistical analysis of the data it collects. More precisely, an adaptive algorithm can increase both false positives (believing actions have different effects when they do not) and false negatives (failing to detect differences between actions). We show how statistical analyses can be modified to take into account properties of the algorithm, but that these modifications do not fully address the problem raised.

We therefore introduce an algorithm that assigns a proportion of participants uniformly randomly and the remaining participants via Thompson sampling. The probability that a participant is assigned using Uniform Random (UR) allocation is set to the posterior probability that the difference between two arms is 'small' (below a certain threshold), allowing for more UR exploration when there is little or no reward to be gained by exploiting. The resulting data can enable more accurate statistical inferences from hypothesis testing by detecting small effects when they exist (reducing false negatives), and reducing false positives.
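The allocation rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the talk's actual implementation: rewards are binary, each arm has a Beta(1, 1) prior, and the threshold value is hypothetical. Drawing one posterior sample per arm and checking whether the sampled difference is below the threshold makes the chance of a uniform assignment equal to the posterior probability that the difference is "small".

```python
import random


def posterior_draw(successes, failures, rng):
    """One sample of each arm's success rate from its Beta posterior."""
    return [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]


def assign_arm(successes, failures, threshold, rng):
    """Return (arm, used_uniform) for a two-armed experiment."""
    p0, p1 = posterior_draw(successes, failures, rng)
    if abs(p0 - p1) < threshold:
        # Sampled difference is "small": explore with uniform random assignment.
        return rng.randrange(2), True
    # Otherwise fall back to a fresh Thompson sampling step.
    q0, q1 = posterior_draw(successes, failures, rng)
    return (0 if q0 >= q1 else 1), False


def simulate(true_rates, n_participants, threshold, seed=0):
    """Run the hybrid rule on simulated participants (true_rates are made up)."""
    rng = random.Random(seed)
    successes, failures = [0, 0], [0, 0]
    n_uniform = 0
    for _ in range(n_participants):
        arm, used_uniform = assign_arm(successes, failures, threshold, rng)
        n_uniform += used_uniform
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures, n_uniform


if __name__ == "__main__":
    print(simulate([0.5, 0.5], n_participants=500, threshold=0.1))
    print(simulate([0.3, 0.7], n_participants=500, threshold=0.1))
```

In this sketch, when the two arms truly have equal rates the posterior mass near zero difference stays high, so most participants are assigned uniformly. When the arms differ clearly, assignment shifts toward Thompson sampling as evidence accumulates.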

The work we present aims to surface the underappreciated complexity of using adaptive experimentation to both enable scientific/statistical discovery and help real-world users. The current work takes a first step towards computationally characterizing some of the problems that arise, and what potential solutions might look like, in order to inform and invite multidisciplinary collaboration between researchers in machine learning, statistics, and the social-behavioral sciences.

Psychology Targeted Talk

Slides: Social Psychology Research Group Talk v5.pptx 

Recording: Social Psychology Research Group Talk.mp4

Title: Conducting Adaptive Field Experiments that Enhance and Personalize Education and Health Technology

Understanding people’s complex real-world thinking is a challenge for psychology, while human-computer interaction aims to build computational systems that can behave intelligently in the real world. This talk presents a framework for redesigning the everyday websites people interact with to function as: (1) Micro-laboratories for psychological experimentation and data collection, (2) Intelligent adaptive agents that implement machine learning algorithms to dynamically discover how to optimize and personalize people’s learning and reasoning. I present an example of how this framework is used to embed randomized experiments into real-world online educational contexts – like learning to solve math problems – and how machine learning is used for automated experimentation. Explanations (and experimental conditions) are crowdsourced from teachers and scientists, and reinforcement learning algorithms for multi-armed bandits are used in real time to discover the best explanations for optimizing learning. Ongoing research examines tools for managing stress by applying approaches like cognitive behavior therapy, helping university students self-regulate and plan, and encouraging health behavior change through a collaboration with Goodlife to get people to go to the gym.

Statistics Targeted Talk

Title: Adapting Real-World Experimentation To Balance Enhancement of User Experiences with Statistically Robust Scientific Discovery

Slides: tiny.cc/jstatstalk

Recording: https://www.youtube.com/watch?v=vV1ShFophpM 

How can we transform the everyday technology people use into intelligent, self-improving systems? For example, how can we perpetually enhance text messages for managing stress, or personalize explanations in online courses? Our work explores the use of randomized adaptive experiments that test alternative actions (e.g. text messages, explanations), aiming to gain greater statistical confidence about the value of actions, in tandem with rapidly using this data to give better actions to future users.

To help characterize the problems that arise in statistical analysis of data collected while trading off exploration and exploitation, we present a real-world case study of applying the multi-armed bandit algorithm TS (Thompson Sampling) to adaptive experiments, providing more empirical context to issues raised by past work on adaptive clinical trials. TS aims to assign people to actions in proportion to the probability those actions are optimal. The empirical results help characterize how a reward maximizing algorithm can increase both false positives (believing actions have different effects when they do not) and false negatives (failing to detect differences between actions). We explore two methods that take into account properties of the TS algorithm used to collect data: inverse-probability weighting and an 'algorithm-induced hypothesis test' that uses non-parametric simulations under the null. These help but do not fully address the problems raised.
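For readers unfamiliar with inverse-probability weighting, the sketch below shows one common form, the self-normalized (Hajek) estimator of an arm's mean reward. It is an illustration only and may differ from the analysis used in the talk. It assumes the probability with which the algorithm chose each participant's arm was logged at assignment time. The record layout and function name are hypothetical.

```python
def ipw_arm_mean(records, arm):
    """Self-normalized IPW estimate of one arm's mean reward.

    records: iterable of (assigned_arm, reward, prob_of_assigned_arm) tuples,
    where the probability is the one the algorithm used when assigning.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for assigned_arm, reward, prob in records:
        if assigned_arm != arm:
            continue
        if not 0.0 < prob <= 1.0:
            raise ValueError("assignment probability must be in (0, 1]")
        # Observations the algorithm was unlikely to collect count for more.
        weight = 1.0 / prob
        weighted_sum += weight * reward
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no observations for this arm")
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Made-up log: (arm, reward, probability the arm was chosen).
    log = [(0, 1.0, 0.5), (1, 0.0, 0.5), (0, 0.0, 0.25), (0, 1.0, 0.75)]
    print(ipw_arm_mean(log, arm=0))
    print(ipw_arm_mean(log, arm=1))
```

When every assignment probability is equal, as in a uniform random experiment, the weights cancel and the estimate reduces to the ordinary sample mean for that arm.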

We therefore introduce an algorithm which assigns a proportion of participants uniformly randomly and the remaining participants via Thompson sampling. The probability that a participant is assigned using Uniform Random (UR) allocation is set to the posterior probability that the difference between two arms is 'small' (below a certain threshold), allowing for more UR exploration when there is little or no reward to be gained by exploiting. The resulting data can enable more accurate statistical inferences from hypothesis testing by detecting small effects when they exist (reducing false negatives), and reducing false positives.

The work we present aims to surface the underappreciated complexity of using adaptive experimentation to both enable scientific/statistical discovery and help real-world users. We conduct field deployments and provide software the community can use to evaluate statistical tests and algorithms in complex real-world applications. This helps provide first steps towards integrating two key approaches: (1) How can we modify statistical tests to better match the properties of the algorithms that collect data in adaptive experiments? (2) How can we modify algorithms for adaptive experimentation to be more "statistically considerate" in being better suited to inference and analysis of data, while maximizing chances of giving participants useful arms? Tackling these questions requires multidisciplinary collaboration between researchers in machine learning, statistics, and the social-behavioral sciences. This is joint work with Nina Deliu, Sofia Villar, Audrey Durand, Anna Rafferty, and others.

Talks Before 2013

"Excellent Online Education" (EdX)

Abstract: The expanding world of online learning has provided environments that compete along many dimensions, but one is paramount: The depth, speed, and longevity of student learning. Despite the breakneck pace of online development, the challenge of producing excellent education can be most effectively met by building on what is already known: incorporating the best scientific research on learning, and surveying outstanding educational practice. Examples of evidence-based principles for improving learning in an online course are presented in the context of the EdX platform: Teaching for Transfer, Teaching Incremental Theories of Intelligence, Problem-Based Learning, Case-based reasoning, Use of Analogies, Comparison of Examples, Retrieval Practice, and Prompting for Explanations. By building a foundation on current research and making experiments (A/B testing) a ubiquitous feature of everyday instruction, I present a Cognitive Science perspective on how online courses can become the dominant source of education research and innovation.

"How to Use Scientific Research to Improve Khan Academy Exercises, and How to Use Khan Academy Exercises to Improve Scientific Research." (Khan Academy)

Abstract: How should we design Khan Academy exercises to produce excellent student learning? How should problems be presented and solutions be communicated, and can we make them more interactive? Cognitive Science research has three insights to answer these and many other practical questions. (1) Avoid reinventing the wheel – review & apply the scientific insights from thousands of previous studies. I will present on Using Cognitive Science Research on Learning to Improve Education. (2) Solve practical problems like a scientist – conduct experiments to evaluate learning whenever possible. I will discuss proposed experiments. (3) "Good designers copy, great designers steal". I report on my exploration of numerous examples of different approaches to the "art" of developing exercises, mining them for insights to synthesize and combine. In turn, engaging in this attempt to use science to improve the practical problem of learning exercises can improve science: By evaluating theories in a real-world environment, disseminating scientific insights in explicit "product form" as thoughtfully crafted exercises, and by unifying knowledge across topics and disciplines around a shared practical problem.

"Using Cognitive Science to Improve E-learning" (Bloomsburg Corporate Advisory Council Talk)

SlideShare Presentation (directed at instructional designers and e-learning professionals who focus on industry and businesses.)

"How can Cognitive Science improve Online Learning at Google and Google in Education?" (Google Tech Talks)

Presentation on Slideshare

Youtube Video

Abstract: Knowledge and technology that maximizes human learning has financial value for Google in customer education and internal training, as well as social value for the public initiatives of Google in Education. Recent research in Cognitive Science provides complementary insights to those gained from practical experience and the research in Computer Science, Education and other Learning Sciences. This talk considers how learning can be improved by: (1) Asking questions and requesting explanations; (2) Presenting specific examples to illustrate abstract principles; (3) Using tests as pedagogical rather than assessment tools. Moreover, online education provides the unique opportunity of hybrid research that is simultaneously applied and academic. Online environments satisfy the scientific requirements of randomized experiments and precise control, as well as the practical need for ecological validity, fidelity, and scalable dissemination. The Cognitive Science focus on identifying both similarities and differences across learning contexts positions it well for doing research that simultaneously advances public education and a corporate mission. In addition to presenting ongoing research at Khan Academy and MOOCs like EdX, I discuss how analogous principles can be explored in teaching end-users Google Power Search, internal training, and customer education. 

"Using Research to Maximize the Practical Impact of Online Educational Resources" (UC Berkeley Graduate School of Education)

Abstract: From K-12 students to adults in the workforce, people are increasingly learning from online educational resources like videos, lessons, and interactive exercises. This has made the benefits and costs of online learning a source of lively debate. This talk presents a framework for creating and improving online educational resources that: (1) Complements the real-world instructional needs of teachers and students by providing instantly accessible and constantly improving online lessons and exercises; (2) Uses technology/software that allows rapid authoring & revision of resources by instructors and researchers; (3) Supports "in vivo" experiments that compare different instructional methods; (4) Collects practical and scientifically validated measures of learning; (5) Is guided by theory and scientific evidence from the cognitive and learning sciences.

This framework is illustrated using ongoing research that: (a) Boosts community college students' grades through brief online lessons that teach them that their intelligence is malleable, (b) Uses laboratory experiments to iteratively develop online lessons to teach general learning strategies to K-12 and undergraduate students; (c) Promotes learning on KhanAcademy.org by prompting students to generate explanations and answer conceptual questions about the mathematics exercises they are solving.

"Supporting Instructors in MOOCS: Using Cognitive Science Research to Guide Pedagogy and Instructional Design" (EdX/MIT (Instructor Support))

Abstract: How can online learning platforms provide useful information about pedagogy to instructors teaching online, while ensuring that course teams are not constrained in leveraging their teaching expertise to personalize their MOOC? The scientific literature on learning and education provides hundreds of detailed studies, which can be synthesized to identify effective instructional strategies, and mined for examples of how an instructional strategy can be implemented in a specific environment, set of educational materials, or student population. This talk illustrates this approach, by presenting a worksheet guide that supports MOOC designers in using two instructional strategies: increasing student motivation to think through challenges by designing exercises which encourage students to see their intelligence as malleable, and enhancing deep understanding with questions and prompts for students to explain. The talk explains how these two instructional strategies are motivated by both existing literature and recently conducted experimental studies. It also presents the specific details of how the guide is targeted at MOOC instructors and provides them with multiple actionable strategies they can use in their courses.

SlideShare

"How Can Online and Blended Education Be Used to Close the Loop Between Research and Practice?" (Stanford Lytics Lab)

Abstract: The development of online and blended learning offers a variety of new opportunities. This talk considers one approach to using online educational resources to close the loop between research on learning and the practical delivery of online & blended education, motivated by the interdisciplinary field of cognitive science. Not only are online educational resources (e.g. videos, mathematics exercises) ecologically valid, scalable, and delivered instantaneously, but they also provide a laboratory environment that supports randomized assignment to instructional conditions, experimental control, and automatic collection of dependent measures. I discuss how my research – on how generating explanations promotes learning – attempts to close the research-practice loop by going back and forth between: reviewing primary literature, synthesizing practical implications, conducting basic experimental psychology research, online studies that use educational materials with convenience samples from Mechanical Turk, and embedded "in vivo" experiments to improve learning from Khan Academy's mathematics exercises.

I also describe the collaborative approach that has worked well in this context: involving researchers (from psychology, education, computer science) as well as practitioners (teachers, students, designers), and using the internet to support diverse, multi-site collaboration. I close with a brief overview of the resources I have found especially valuable for connecting research & practice in online & blended education: research literature & reviews in the cognitive and learning sciences, online platforms and ed-tech products that are especially well-suited for research, funding & grant opportunities, and specific ways to facilitate future researcher-practitioner collaborations.

"Using Online Experiments to Enhance Educational Research & Practice" (Pittsburgh Science of Learning Center Summer School, at Carnegie Mellon)

Slideshare

"How Online Educational Resources and the Internet Can Facilitate Diverse Research-Practice and Cross-Disciplinary Collaborations" (Panel at AERA)

(Invited Panel Presentation at AERA 2014 SIG Computers & Internet Applications in Education)

Presenters: Joseph Jay Williams, Ann Edwards & Maria Mendiburo, Anne Trumbore, Marcia Linn, Hui Soo Chae & Gary Natriello, Piotr Mitros, George Siemens

Abstract: Increasing use of online educational resources and the Internet bring students together around shared resources. Can a similar benefit be found for educators and researchers? This invited panel considers this question through presentations which explain how collaborative work in both research and practice was supported by the affordances of the Internet or the affordances provided by working on online educational resources. As educational materials (like sequences of lessons or interactive exercises) become digital resources on the Internet, the reduction in space and time constraints makes them more widely available for students. Similarly, researchers working to improve and understand online resources can use the Internet to collaborate and consult with other researchers and practitioners who span a diverse range of disciplines, perspectives, geographic locations, and schedules.

Document with agenda and shared notes for the event

"Improving Online Educational Resources Using Cognitive Science (and Online Collaborations between Cognitive Scientists and Educators)" (iNACOL Webinar Series)

Description: This webinar provides practical information on how to use published research findings and make contact with cognitive scientists in order to improve K-12 and university students’ learning from digital online resources, like Khan Academy videos or interactive mathematics exercises. The webinar focuses on how students’ motivation and grades have been increased by helping them believe they can take charge of their learning and become smarter, and how students can be supported in reflective thinking and seeking deep understanding, when questions and prompts for students to explain are inserted in videos and interactive exercises. Links to further reading and implementation guides will be provided, from a curated collection of practical but scientifically based principles (e.g. www.josephjaywilliams.com/education or tiny.cc/improveonlinelearning). 

The webinar will also consider how research and practice can be more closely linked by the transition from spoken lectures and pen-and-paper assignments to digital online videos and exercises. Online Educational Resources create a “Real-World Laboratory” where scientists can measure learning outcomes for hundreds of students and do experiments that directly compare the benefits of different versions of online videos and exercises that use different instructional methods. Educators can also advise scientists on doing more practical research, by sharing their understanding and wisdom about the challenges students face in learning from a particular set of videos and exercises. While conversations and collaborations between educators and researchers can be difficult to coordinate, the webinar explains simple systems for sharing and commenting on a set of online resources, and strategies for connecting researchers and educators interested in the same educational challenges. Please feel free to post questions and thoughts at tiny.cc/inacolwebinar before, during and after the webinar.

Powerpoint Slides

Blackboard Presentation

"How Online Educational Resources Provide Novel Affordances for Conducting Practical Interventions and Doing Psychology Experiments" (Stanford Psychological Interventions in Educational Settings (PIES) Group)

SlideShare

"Doing Online Learning Research With Both Scientific and Financial Value" (Stanford Lytics Lab)

SlideShare with audio

"How to Use Online Resources to Facilitate Collaboration Across Disciplines" (National Science Foundation Seventh Annual Inter-Science of Learning Centers Conference)

Presenters: Joseph Jay Williams and Cristina Zepeda

Context: Workshop presented at iSLC 2014

"Embedding Experiments in Online Educational Resources to Understand and Improve Learning" (Arizona State University)

Abstract: A series of experiments is presented which leverages digital online educational resources (like mathematics exercises on Khan Academy or videos in MOOCs/Massive Open Online Courses) to build from lab-based psychological research on learning and motivation to randomized experiments embedded in real-world educational products. This work comprises the complementary threads of (1) understanding how the cognitive processing engaged by generating explanations underlies learning, and (2) understanding how beliefs (about the malleability of intelligence) influence motivation.

The first line of research investigates how people learn through generating explanations (concerning domains from artificial categories to statistics), providing evidence for the novel “Subsumptive Constraints” account of why explaining “why?” helps learning – by driving a search for underlying patterns and principles. Two ongoing experiments embedded in Khan Academy math exercises and a biology MOOC extend this work to prompting people to generate explanations while studying worked examples and videos. 

The second line of research finds that students’ motivation is increased through the minimal addition of messages about the malleability of intelligence – they attempt and solve more math problems – although there is no such effect of encouraging messages that do not emphasize intelligence’s malleability. 

Since digital educational resources bring students’ authentic learning into a medium with the affordances of a laboratory – randomized assignment, experimental control, and data collection – they provide a novel opportunity to bridge research and practice across multiple disciplines. 

"Scientifically Guided Benchmarking: Using Research to Identify Practical Improvements to Modular Educational Resources" (Coursera)

Information: tiny.cc/courseratalk

Abstract: How can existing literature and research methodologies be used practically to support deciding between the many options faced by instructors and engineers? This talk presents one approach, in which an interdisciplinary scientific knowledge base about learning is combined with benchmarking or adapting best practices from examples of educational technology. This involves a dual focus on selectively synthesizing existing research and combining insights from multiple practical technologies, which can support the identification and implementation of the high quality pedagogical principles most appropriate for a target context/platform.

This approach is illustrated with randomized experiments or A/B tests in Khan Academy’s mathematics exercises that use messages to increase motivation and teach learning & problem-solving strategies, in-progress studies to boost motivation and increase learning from MOOC videos, and examples of interactive online tools to help people practice strategies for learning in MOOCs, and apply concepts from a management workshop to everyday interactions.

Based on this work, ideas about potential directions for MOOCs & Online Learning with practical, financial, and scientific value are presented for discussion and feedback: The value of focusing offerings, development & research on modular online educational resources or “MOOClets” that are at a smaller grain size than full courses (e.g. lessons, videos, assignments); facilitating iterative revision & improvement of content & interactive exercises; supporting rapid collaborative feedback on online resources; and providing technology support for students to learn generalizable skills along with content – like strategies for learning online, solving problems, and effective interpersonal collaboration and management.

"Experiment-Focused Instructional Design: Using A/B Testing in Blended Learning Resources as Both a Conceptual and Empirical Tool" (Stanford Instructional Design Special Interest Group)

Information: tiny.cc/idtalk

Abstract: A powerful feature of educational resources that are digital and cloud-based is that instructional designers can iteratively improve them, using the A/B testing affordances of software to explore how students' learning can be helped by changes to a resource. This talk considers the value of Experiment-Focused Design as a strategy for bringing the instructional and pedagogical expertise of instructors and designers to bear in specific contexts. Experiment-Focused Design refers to conceptually and/or empirically evaluating design decisions (like how to create text and video lessons, homework assignments, exercises for peer collaboration) in terms of how randomly assigning students to alternative versions of a resource would impact particular measures of student learning.

The presentation explains how designers can easily use software like Qualtrics to do qualitative and quantitative experimentation, in settings from small in-class groups to MOOCs to convenience samples available on Amazon Mechanical Turk. Examples available for illustration include: changing text-based lessons, interactive online mathematics exercises (e.g. Khan Academy), designing activities to structure peer discussion, digital tools to support study strategies, problem-solving skills, and application of concepts from a management course to real life. The examples presented can also be tailored to the interests of Stanford IDs – please provide suggestions or requests at tiny.cc/idinput or to josephjaywilliams@stanford.edu

"Learning Engineering of MOOClets: Simultaneously benefiting Professional Learning, Financial Success, and Cognitive & Learning Sciences Research" (Declara)

Title: Learning Engineering of MOOClets: Simultaneously benefiting Professional Learning, Financial Success, and Cognitive & Learning Sciences Research 

Context: Invited talk at Declara 

SlideShare

"Using Modular Design & “MOOClets” to connect Learning to Real-World Tasks and promote Generalizable Skills" (Udacity)

Title: Using Modular Design & “MOOClets” to connect Learning to Real-World Tasks and promote Generalizable Skills

Context: Invited talk at Udacity

"MOOClet-Driven Research & Development: Leveraging Iterative, Collaborative Development of Modules to align Course Production & Research" (HarvardX)

Title: MOOClet-Driven Research & Development: Leveraging Iterative, Collaborative Development of Modules to align Course Production & Research

Context: Invited talk at HarvardX

"Using technology to bridge cognitive research with practical impact on motivation and learning" (CMU)

Title: Using technology to bridge cognitive research with practical impact on motivation and learning

Context: Invited talk at Carnegie Mellon University

"Enhancing Education Research and Practice Using Qualtrics" (Qualtrics)

Title: Enhancing Education Research and Practice Using Qualtrics

Context: Talk at the 2015 Qualtrics Insight Summit

"MOOClets: A Framework for real-time Adaptive Personalization using A/B Experiments" (McGraw Hill Meetup)

Title: MOOClets: A Framework for real-time Adaptive Personalization using A/B Experiments

Context: Talk at the McGraw Hill Education Meetup on Predictive Analytics for Education.

"The MOOClet Formalism & API: Enabling active machine learning to modify and personalize user-facing software components" (MIT)

Title: The MOOClet Formalism & API: Enabling active machine learning to modify and personalize user-facing software components

Context: Talk at Josh Tenenbaum's Computational Cognitive Science Group.