Joel Schreiber

Master's Student

As a master’s student in Cognitive Science with a background in mathematics and computational neuroscience, I focus on understanding the cognitive foundations of artificial intelligence. My primary research interest lies in AI alignment, with a particular emphasis on mechanistic interpretability and the emergence of misaligned behavior in large language models (LLMs). I explore how fine-tuning impacts internal representations and planning-like behaviors in these models, aiming to better understand—and ultimately guide—their cognitive trajectories. By combining tools from neuroscience, philosophy, and AI research, I hope to contribute to the safe and transparent development of intelligent systems.

Address

Email

Principal Investigator:

ag2542@cam.ac.uk

Lab Manager:

timna.kleinman@mail.huji.ac.il 

©2024 by Miriam Havin
