Anthropic Fellows Program — AI Safety

Anthropic

London, UK; Ontario, CAN; Remote-Friendly, United States; San Francisco, CA
Full-time
Research Scientist

About this role

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

Apply using this link. We are accepting applications on a rolling basis for the next cohort of Anthropic Fellows, which starts in early October. In exceptional circumstances, we can accommodate fellows starting outside the usual cohort timelines — please note in your application if the October start date doesn't work for you.

This page is specific to one of the Anthropic Fellows workstreams; see also the main Anthropic Fellows posting.

Anthropic Fellows Program overview

The Anthropic Fellows Program is designed to foster AI research and engineering talent. We provide funding and mentorship to promising technical talent, regardless of previous experience.

Fellows will primarily use external infrastructure (e.g. open-source models, public APIs) to work on an empirical project aligned with our research priorities, with the goal of producing a public output (e.g. a paper submission). In one of our earlier cohorts, over 80% of fellows produced papers.

We run multiple cohorts of Fellows each year and review applications on a rolling basis. This application is for cohorts starting in July 2026 and beyond.

What to expect

4 months of full-time research
Direct mentorship from Anthropic researchers
Access to a shared workspace (in either Berkeley, California or London, UK)
Connection to the broader AI safety and security research community
Weekly stipend of 3,850 USD / 2,310 GBP / 4,300 CAD + benefits (these vary by country)
Funding for compute (~$15k/month) and other research expenses

Interview process

The interview process will include an initial application & reference check, technical assessments & interviews, and a research discussion.

We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.

Compensation

The expected base stipend for this role is 3,850 USD / 2,310 GBP / 4,300 CAD per week, with an expectation of 40 hours per week for 4 months (with possible extension).

Fellows workstreams

Due to the success of the Anthropic Fellows for AI Safety Research program, we are now expanding it across teams at Anthropic. We expect there to be significant overlap in the types of skills and responsibilities across the roles and will by default consider candidates for all the workstreams.

Some of the workstreams may include unique assessment steps; we therefore ask you for workstream preferences in the application. You can see an overview of the current workstreams below:

AI Safety Fellows
AI Security Fellows
ML Systems & Performance Fellows
Reinforcement Learning Fellows
Economics & Societal Impacts Fellows

Across the workstreams, you may be a good fit if you:

Are motivated by making sure AI is safe and beneficial for society as a whole
Are excited to transition into empirical AI research and would be interested in a full-time role at Anthropic
Have a strong technical background in computer science, mathematics, or physics
Thrive in fast-paced, collaborative environments
Can implement ideas quickly and communicate clearly

Strong candidates may also have:

Strong background in a discipline relevant to a specific Fellows workstream (e.g. economics, social sciences, or cybersecurity)
Experience in areas of research or engineering related to their workstream

Candidates must be:

Fluent in Python programming
Available to work full-time on the Fellows program

AI Safety Fellows

Mentors, research areas, & past projects

Fellows will undergo a project selection & mentor matching process. Potential mentors include:

Sam Bowman
Sara Price
Alex Tamkin
Nina Panickssery
Trenton Bricken
Logan Graham
Jascha Sohl-Dickstein
Joe Benton
Collin Burns
Fabien Roger
Samuel Marks
Kyle Fish
Ethan Perez

Our mentors will lead projects in select AI safety research areas, such as:

Scalable Oversight: Developing techniques to keep highly capable models helpful and honest, even as they surpass human-level intelligence in various domains.
Adversarial Robustness and AI Control: Creating methods to ensure advanced AI systems remain safe and harmless in unfamiliar or adversarial scenarios.
Model Organisms: Creating model organisms of misalignment to improve our empirical understanding of how alignment failures might arise.
Model Internals / Mechanistic Interpretability: Advancing our understanding of the internal workings of large language models to enable more targeted interventions and safety measures.
AI Welfare: Improving our understanding of potential AI welfare and developing related evaluations and mitigations.

On our Alignment Science and Frontier Red Team blogs, you can read about past projects, including:

Subliminal Learning: Language Models Transmit Behavioral Traits via Hidden Signals in Data: Alex Cloud, Minh Le, et al., with mentors including Samuel Marks and Owain Evans
Open-source circuits: Michael Hanna and Mateusz Piotrowski with mentorship from Emmanuel Ameisen and Jack Lindsey

For a full list of representative projects for each area, please see these blog posts: Introducing the Anthropic Fellows Program for AI Safety Research, Recommendations for Technical AI Safety Research Directions.

Unique candi
