I'm an ML engineer at Amazon Prime Video building LLM-powered systems that serve hundreds of millions of users across multiple regions. The day job is production infrastructure (pipelines, agents, reliability engineering) on systems where a single regression ripples out to tens of millions of streams.
Outside Amazon, I spend my evenings on open-source, research-adjacent work: reproducing Anthropic's induction-heads result on GPT-2, building safety eval suites with calibrated judges, and measuring coding agents with real failure-mode analysis. I try to write code that admits its failure modes. Every repo ships with committed run artifacts, so you can check the headline claim before installing anything. The mech-interp repo has an explicit regression guard for the off-by-one I caught in my own v0.1; if someone ever reverts the fix, the guard fails loudly.
Currently interviewing for research-engineer and frontier-engineer roles at AI labs. Based in New York, open to relocation. Work authorization: STEM OPT, eligible for cap-exempt H-1B.