Aswin Surya

AI-safety research · UC Berkeley EECS

I work on AI safety, specifically chain-of-thought faithfulness and monitorability: whether a model's stated reasoning reflects the computation that produces its answer.

Featured work

Cross-judge agreement breaks on saturated CoT-monitorability metrics

Two LLM judges can agree on 97% of items and still post a Cohen's κ around 0.2, which reads as almost no agreement. The standard reliability check collapses exactly where these metrics saturate: the high-monitorability regime they're built to measure. It looks like the judges disagree. They don't.
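To make the arithmetic concrete, here is a minimal sketch of how raw agreement and Cohen's κ can diverge when nearly every item gets the same label. The counts are hypothetical and chosen only to illustrate the effect; they are not data from the writeup, and the helper name `cohens_kappa` is an assumption made for this example.

```python
def cohens_kappa(both_yes, a_only, b_only, both_no):
    """Cohen's kappa for two judges assigning a binary label.

    Arguments are the cells of the 2x2 agreement table:
    both_yes: both judges say "monitorable"
    a_only:   only judge A says "monitorable"
    b_only:   only judge B says "monitorable"
    both_no:  both judges say "not monitorable"
    """
    n = both_yes + a_only + b_only + both_no
    # Observed agreement: fraction of items where the labels match.
    p_o = (both_yes + both_no) / n
    # Each judge's marginal rate of saying "monitorable".
    a_yes = (both_yes + a_only) / n
    b_yes = (both_yes + b_only) / n
    # Agreement expected by chance if the judges labeled independently.
    p_e = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    kappa = (p_o - p_e) / (1 - p_e)
    return p_o, p_e, kappa


# Hypothetical saturated regime: 1000 items, 98% labeled "monitorable" by each judge.
p_o, p_e, kappa = cohens_kappa(both_yes=965, a_only=15, b_only=15, both_no=5)
print(f"raw agreement = {p_o:.3f}, chance agreement = {p_e:.3f}, kappa = {kappa:.3f}")
```

With these counts the judges match on 97% of items, but chance agreement is already about 96% because both marginals sit near 98%. κ rescales the small remaining gap by the equally small headroom 1 − p_e, so it lands near 0.23 even though the judges rarely differ.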

writeup & code coming soon

Research & writing

Agentic evaluation
Benchmarks pushing agents on real web tasks, turning a website screenshot into working code.
Berkeley AI Research (BAIR) · Dawn Song

Verifiable safety
Extending data-driven Hamilton–Jacobi reachability for control systems with unknown dynamics.
Berkeley AI Research (BAIR) · Claire Tomlin

A Mosquito is Worth 16×16 Larvae
First-authored. Vision Transformers vs. CNNs for mosquito-larvae classification; presented at AGU.
arXiv · 2022

Background

Ramp, Backend Software Engineering Intern · 2026
Apple, Software Engineering Intern · 2025
JustPaid.AI, AI Engineering Intern · YC W23 · 2025
Stanford AIMI · MIT Beaverworks, Research Intern · 2023
NASA / UT Austin, AI Research Intern · 2022–23