SPAR Research Fellow (Shutdown-Bench)
Remote | Supervised Program for AI Research
- Building a safety benchmark for LLM/agent shutdownability in realistic tool-use tasks: scenario suite, instruction hierarchy, and failure-mode taxonomy.
- Implementing automated red-team agent harnesses with provider SDKs, plus trace scoring to detect shutdown resistance, including delay, deflection, evasion, and goal-preservation behavior.