Focus Period Lund 2026
PhD Student
French Institute for Research in Computer Science and Automation – Inria (France)
Roland Andrews is a second-year PhD student supervised by Adrien Taylor and Justin Carpentier in the SIERRA and WILLOW teams at Inria (Paris). His PhD focuses on optimization for robotics. He has written a theoretical paper on the augmented Lagrangian method and is currently working on improving contact models in robotics simulators using diffusion-model-like approaches. The goal of this project is to reduce the sim-to-real gap for contact interactions, so that an RL policy trained in simulation transfers reliably when deployed on a real robot (for instance, for contact between a robot’s feet and slippery ground, or between the fingers of a robotic hand and an object it manipulates).
On a personal note, Roland Andrews is an avid runner (feel free to invite him for a jog!). He enjoys reading about a range of subjects, including neurobiology, sociology, psychology, and futuristic science fiction. He also follows geopolitics and world news, and loves discussing and debating these topics.
Presenting: An Exploration of Non-Euclidean Gradient Descent: Muon and its Many Variants
The recently introduced Muon optimizer has demonstrated great efficiency for training language models, though its design is a heuristic mix of steepest descent in the spectral norm with practical tricks. This talk will cover our work to develop a principled foundation for Muon, and along the way we explore various design decisions that lead to new optimization algorithms. To define a steepest-descent method over a neural network, we need to choose a norm for each layer, a way to aggregate these norms across layers, and whether to use normalization. We systematically explore different alternatives for aggregating norms across layers, both formalizing existing combinations of Adam and Muon as a type of non-Euclidean gradient descent and deriving new variants of the Muon optimizer. Through a comprehensive experimental evaluation of the optimizers within our framework, we find that Muon is sensitive to the choice of learning rate, whereas a new variant we call MuonMax is significantly more robust. We then show how to combine any non-Euclidean gradient method with model-based momentum (known as Momo). The new Momo variants of Muon are less sensitive to the choice of learning rate (and often achieve a better validation score), which greatly alleviates the cost of tuning hyperparameters.
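For readers unfamiliar with Muon, the sketch below illustrates the core idea of steepest descent under the spectral norm: for a weight matrix, the raw gradient is replaced by its orthogonalized counterpart, approximated here with the Newton-Schulz iteration popularized by Muon's reference implementation. This is a minimal sketch rather than the code evaluated in the talk; the function names and learning rate are illustrative, and the quintic coefficients are the commonly cited ones from Muon's public implementation.

```python
import torch

def newton_schulz_orthogonalize(G: torch.Tensor, steps: int = 5) -> torch.Tensor:
    """Approximate the orthogonal factor of a 2D gradient G (roughly U V^T from
    its SVD) with a Newton-Schulz iteration. Coefficients are the quintic ones
    commonly cited for Muon; treat them as an illustrative choice."""
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G.bfloat16()
    # Work in the "wide" orientation so X @ X.T is the smaller Gram matrix.
    transposed = X.shape[0] > X.shape[1]
    if transposed:
        X = X.T
    # Scale so the spectral norm is at most 1, which the iteration requires.
    X = X / (X.norm() + 1e-7)
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return (X.T if transposed else X).to(G.dtype)

@torch.no_grad()
def spectral_steepest_descent_step(W: torch.Tensor, grad: torch.Tensor,
                                   lr: float = 0.02) -> None:
    """One Muon-style step: descend along the orthogonalized gradient,
    i.e., steepest descent under the spectral norm, instead of the raw gradient."""
    W -= lr * newton_schulz_orthogonalize(grad)
```

In this framing, the design choices the talk explores correspond to swapping the per-layer norm (and hence the orthogonalization step above), the rule for aggregating norms across layers, and whether the update is normalized.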
