<span class="rightimg"><span class="smallimg"> ![[headshot.jpg]] </span></span> I'm a graduate student at the University of Toronto focusing on [[AI Safety]], supervised by [Sheila McIlraith](https://www.cs.toronto.edu/~sheila/) and [Roger Grosse](https://www.cs.toronto.edu/~rgrosse/). I've done research on understanding transformer predictions at FAR AI, reward learning at the [Center for Human Compatible Artificial Intelligence (CHAI)](https://humancompatible.ai/), and model-based reinforcement learning (MBRL), again with Sheila McIlraith, here at U of T.

I'm currently excited about developing techniques to support AI governance. Email me if this interests you too, and I'd be happy to schedule a one-on-one.

While you're here, feel free to browse. It's a bit sparse at the moment, but I'm hoping to fill this place in with my thoughts and research as they evolve.

> [!example] Workshop papers & Preprints
> * Nora Belrose, Zach Furman, __Lev E McKinney__, Logan Smith, Danny Halawi, Igor Ostrovsky, Stella Biderman, and Jacob Steinhardt. Eliciting latent predictions from transformers with the tuned lens. arXiv preprint arXiv:2303.08112, 2023. URL https://arxiv.org/pdf/2303.08112.pdf.
> * __Lev E McKinney__\*, Yawen Duan\*, David Krueger, and Adam Gleave. On the fragility of learned reward functions. In _Deep Reinforcement Learning Workshop NeurIPS 2022_, 2022. URL https://arxiv.org/abs/2301.03652.
> * Keiran Paster\*, __Lev E McKinney__\*, Sheila A. McIlraith, and Jimmy Ba. BLAST: Latent dynamics models from bootstrapping. In _Deep RL Workshop NeurIPS 2021_, 2021. URL https://openreview.net/forum?id=VwA_hKnX_kR.

> [!info]- Contact Info
> levmckinney [at] cs [dot] toronto [dot] edu