About the job
At Magic, we are dedicated to creating safe artificial general intelligence (AGI) that propels humanity forward in tackling the most pressing global challenges. We believe the most effective route to safe AGI is automating research and code generation to improve models and resolve alignment issues more reliably than humans can alone. Our approach combines cutting-edge pre-training at scale, domain-specific reinforcement learning (RL), ultra-long context capabilities, and optimized inference-time compute.
Role Overview
In your role as a Software Engineer on the Pre-training Systems team, you will be responsible for designing and managing the distributed infrastructure necessary for training Magic’s long-context models at scale.
This position emphasizes large-scale model training on extensive GPU clusters. You will operate at the intersection of deep learning and distributed systems, ensuring that training runs are efficient, reliable, and reproducible at extreme scale.
Magic’s long-context models present complex systems challenges: sustained memory pressure, communication overhead across thousands of devices, long-running jobs that demand fault tolerance, and efficient sequence packing within hardware constraints. You will own the systems that keep large-scale pre-training both stable and fast.
Your Contributions
Scale distributed training across large GPU clusters, implementing data, tensor, and pipeline parallelism.
Optimize communication patterns and gradient synchronization.
Enhance checkpointing, fault tolerance, and job recovery mechanisms.
Profile and resolve performance bottlenecks across compute, networking, and storage.
Advance experiment reproducibility and orchestration workflows.
Boost hardware utilization and overall training throughput.
Collaborate with Kernel and Research teams to align model architecture with system capabilities.
Qualifications We Seek
Solid foundation in software engineering and distributed systems.
Experience with training large models in multi-node GPU environments.
In-depth understanding of parallelism techniques and performance trade-offs.
Experience in debugging cross-layer issues within production ML systems.
Demonstrated ownership mentality and capability to manage critical infrastructure.
Proven track record in enhancing the performance or reliability of large-scale systems.