arxiv:2310.08164
Abdullah
amirali1985
AI & ML interests
Mechanistic interpretability, high-dimensional geometry, persona role-playing.
Recent Activity
updated a dataset 10 days ago
PhillipsLab/axbench-steering-data
published a dataset 11 days ago
PhillipsLab/axbench-steering-data
updated a model 12 days ago
thoughtworks/coding-sorl