Research & Papers
ML · Evaluation
I like framing questions precisely, building datasets, and designing evaluation harnesses. I've worked on visual commonsense reasoning with Prof. Ernest Davis, focusing on “visibility” and how models represent what can or cannot be seen.
- Dataset and annotation design
- Evaluation scripts across multiple models
- Error analysis and qualitative examples