Projects
Building a Legal AI Application With SimpleMem, AutoResearch, and Legal Skills
May 2026
A local Legal AI application that combines document upload, SimpleMem conversation memory, legal workflow skills, and retrieval optimization for contract and legal question answering.
PDF | Code | Read more →
Recursive Failure Archaeology: RLM for Agent Failure Diagnosis
May 2026
A recursive investigation pipeline that diagnoses agent failures by segmenting long trajectories, generating hypotheses, pruning evidence, verifying controls, and comparing the results against full-context GPT-5.5.
Trace2Evolve: A Karpathy-Style AutoResearch Harness for Customer-Support Agents
May 2026
An AutoResearch harness that improves tool-using customer-support agents with benchmark traces, held-out tau2 evaluations, synthetic pressure tests, and reliability-gated promotion.
Teaching an LLM to Explore: Reinforcement Learning for Document Navigation
March 2026
PDF | Code | Read more →
Privacy Guard: Learning Privacy-Budgeted Active Sensing Policies via Reinforcement Learning in Smart Home Environments
March 2026
PDF | Code | Read more →
Training a Deep Q-Network to Master Uno: A Comprehensive Study in Reinforcement Learning for Imperfect Information Games
January 2026