Dheeraj Vattikonda

Visiting Researcher, ServiceNow AI Research
M.Sc. Student, McGill University & Mila

Google Scholar / 𝕏 @dheeraj_46329 / LinkedIn

About

I work on reinforcement learning and reasoning with LLM agents on the Long Horizon Agents team under Massimo Caccia at ServiceNow AI Research, based in Montreal. I am also a Master's student at McGill University and Mila, advised by Xue (Steve) Liu. My current research focuses on reasoning in LLM web agents and tool-calling systems. My recent work on web agent training received an oral presentation at the ICML 2025 CUA Workshop and was accepted at NeurIPS 2025.

I completed my Bachelor's in Electronics and Communication Engineering at NIT Hamirpur. During my undergrad, I worked as a researcher at Mila and IIT Delhi, focusing on robot perception, differentiable SLAM systems, and LiDAR-based perception tasks for autonomous navigation.

News

Feb 2026 New paper: π-Distill — Privileged Information Distillation for Language Models
Feb 2026 Check out our blog on Self-Distillation and Privileged Information Distillation
Dec 2025 Apriel-1.6-15b-Thinker is out! A 15B reasoning model achieving frontier multimodal performance
Dec 2025 "How to Train Your LLM Web Agent" accepted at NeurIPS 2025
Jul 2025 Oral presentation at ICML 2025 CUA Workshop
Dec 2024 Aurora accepted as Spotlight at NeurIPS 2024
Aug 2024 Started as Visiting Researcher at ServiceNow AI Research
Aug 2023 Started M.Sc. at Mila / McGill University
Nov 2022 Started research internship at IIT Delhi
Jan 2022 Started research internship at Mila

Blogs

FEB 2026

Understanding Self-Distillation and Privileged Information Distillation

Emiliano Peñaloza, Dheeraj Vattikonda, Siddarth Venkatraman, Massimo Caccia

Blog post, Feb 2026

blog

DEC 2025

Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance

Sathwik Tejaswi Madhusudhan, Sagar Davasam, Torsten Scholak, Massimo Caccia, Dheeraj Vattikonda, and others

ServiceNow AI Research, Dec 2025

blog

Publications

2026

ARXIV 2026

Privileged Information Distillation for Language Models

Emiliano Peñaloza, Dheeraj Vattikonda, Nicolas Gontier, Alexandre Lacoste, Laurent Charlin, Massimo Caccia

arXiv preprint, Feb 2026

paper blog

2025

NeurIPS 2025

How to Train Your LLM Web Agent: A Statistical Diagnosis

Dheeraj Vattikonda, Santhoshi Ravichandran, Emiliano Peñaloza, Hadi Nekoei, Megh Thakkar, Thibault Le Sellier de Chezelles, Nicolas Gontier, Miguel Muñoz-Mármol, Sahar Omidi Shayegan, Stefania Raimondo, Xue Liu, Alexandre Drouin, Laurent Charlin, Alexandre Piché, Alexandre Lacoste, Massimo Caccia

ICML 2025 CUA Workshop (Oral) · NeurIPS 2025

paper slides

2024

NeurIPS 2024

Learning Action and Reasoning-Centric Image Editing from Videos and Simulations

Benno Krojer, Dheeraj Vattikonda, Luis Lara, Varun Jampani, Eva Portelance, Christopher Pal, Siva Reddy

NeurIPS 2024 (Spotlight)

paper website

ICIPCW 2024

SLACK: Attacking LiDAR-based SLAM with Adversarial Point Injections

Prashant Kumar, Dheeraj Vattikonda, Kshitij Madhav Bhat, Kunal Dargan, Prem Kalra

ICIPCW 2024