Projects

Trace2Evolve: A Karpathy-Style AutoResearch Harness for Customer-Support Agents

May 2026

An AutoResearch harness that improves tool-using customer-support agents through benchmark traces, held-out tau2 evaluations, synthetic pressure tests, and reliability-gated promotion.

Code | Read more →

Teaching an LLM to Explore: Reinforcement Learning for Document Navigation

March 2026

PDF | Code | Read more →

Privacy Guard: Learning Privacy-Budgeted Active Sensing Policies via Reinforcement Learning in Smart Home Environments

March 2026

PDF | Code | Read more →

Training a Deep Q-Network to Master Uno: A Comprehensive Study in Reinforcement Learning for Imperfect Information Games

January 2026

PDF | Read more →