About the job
Join Sanity, where we are pioneering the future of AI-driven Content Operations. Our AI Content Operating System empowers teams to model, create, and automate content in alignment with their unique business processes, speeding up digital development and making content operations far more efficient. Industry leaders such as Linear, Figma, Riot Games, Anthropic, Spotify, Arc’teryx, and Morning Brew trust Sanity to automate their content workflows.
As a Senior Software Engineer focused on AI Growth, you will be pivotal in developing the complete vertical for AI-native applications at Sanity. This includes advancing our Model Context Protocol (MCP) server and delivering agentic systems that can scaffold Sanity applications, Studios, and comprehensive front-end solutions. In collaboration with designers, product managers, and fellow engineers, you will engage in rapid experimentation, transforming the latest AI capabilities into tangible products. If you thrive on innovation, enjoy late-night tinkering with new model releases, and envision possibilities beyond conventional boundaries, this role is perfect for you.
Recent achievements include:
Sanity MCP Server – our remote MCP server, now generally available and recognized as one of eight official MCP servers in Vercel's v0.
Sanity Agent Toolkit – agent rules exposed via MCP, including a Claude Code plugin featuring skills and slash commands.
Auto-configured MCP in the CLI – the Sanity CLI now automatically detects code editors and configures the MCP server seamlessly.
Your Responsibilities
Enhance our MCP server – introduce new functionalities, tools, and integrations that enable AI agents to interact more effectively with Sanity's content platform.
Develop agentic developer tools – create systems that can autonomously generate Sanity applications, Studios, and front-end code with minimal human intervention.
Design and execute evaluations – build evaluation suites with Braintrust to catch regressions, measure improvements, and make data-informed decisions about prompt and model changes.

