Stephen McAleer

Postdoc at Carnegie Mellon University

Google Scholar | CV | Twitter

I am broadly interested in algorithms for robust sequential decision-making. My long-term goal is to develop an agent that can accomplish any task that a human can perform on a computer.

I am currently a postdoc at CMU working with Tuomas Sandholm. I received my PhD in computer science from the University of California, Irvine, where I worked with Pierre Baldi. During my PhD, I did research scientist internships at Intel Labs and DeepMind. Before that, I received my bachelor's degree in mathematics and economics from Arizona State University in 2017.

Representative Papers

Language Models can Solve Computer Tasks
Geunwoo Kim, Pierre Baldi, Stephen McAleer
ArXiv 2023
Paper | Code

ESCHER: Eschewing Importance Sampling in Games by Computing a History Value Function to Estimate Regret
Stephen McAleer, Gabriele Farina, Marc Lanctot, Tuomas Sandholm
International Conference on Learning Representations (ICLR) 2023
Paper | Code

Mastering the Game of Stratego With Model-Free Multiagent Reinforcement Learning
Julien Perolat*, Bart de Vylder*, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T Connor, Neil Burch, Thomas Anthony, Stephen McAleer, Romuald Elie, Sarah H Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Remi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls*
Science 2022

XDO: A Double Oracle Algorithm for Extensive-Form Games
Stephen McAleer, John Lanier, Kevin A Wang, Pierre Baldi, Roy Fox
Conference on Neural Information Processing Systems (NeurIPS) 2021
Paper | Code

Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games
Stephen McAleer*, John Lanier*, Roy Fox, Pierre Baldi
Conference on Neural Information Processing Systems (NeurIPS) 2020
Paper | Code

Solving the Rubik's Cube With Approximate Policy Iteration
Stephen McAleer*, Forest Agostinelli*, Alexander Shmakov*, Pierre Baldi
International Conference on Learning Representations (ICLR) 2018

Selected Press

Popular Science: Here's how a new AI mastered the tricky game of Stratego.

TechCrunch: Now AI can outmaneuver you at both Stratego and Diplomacy.

Gizmodo: DeepMind's New AI Uses Game Theory to Trounce Humans in 'Stratego'.

MIT Technology Review: A machine has figured out Rubik's Cube all by itself.

Washington Post: How quickly can AI solve a Rubik's Cube? In less time than it took you to read this headline.

LA Times: A machine taught itself to solve Rubik's Cube without human help, UC Irvine researchers say.

BBC: AI Solves Rubik's Cube in One Second.