Staff Machine Learning Research Scientist - LLM Evaluations
Scale AI|San Francisco, CA; Seattle, WA; New York, NY
On-site Full-time $280K/yr - $380K/yr
Experience Level
Mid to Senior
Qualifications
You will:
Lead investigations into the effectiveness and limitations of current LLM evaluation techniques.
Design and implement innovative evaluation benchmarks for large language models, focusing on instruction adherence, factual accuracy, robustness, and fairness.
Build and maintain strong relationships with clients and cross-functional teams to drive collaborative projects.
Work alongside internal teams and external partners to refine evaluation metrics and develop standardized protocols.
Create scalable and reproducible evaluation pipelines utilizing modern machine learning frameworks.
Publish findings in prestigious AI conferences and contribute to open-source benchmarking efforts.
Mentor and lead research scientists and engineers, providing technical guidance across various projects.
Engage actively with the ML research community to stay updated on emerging developments and contribute to the advancement of LLM evaluation science.
Excel in a dynamic, fast-paced startup environment and commit to achieving impactful results.
About the job
At Scale AI, we are the premier partner for data and evaluation in the rapidly evolving field of artificial intelligence. Our commitment to advancing the assessment and benchmarking of large language models (LLMs) positions us at the forefront of AI innovation. We are dedicated to creating leading-edge LLM evaluation methodologies that set new benchmarks for model performance.
Our research teams collaborate with the top AI laboratories in the industry to provide high-quality data, accelerate progress in generative AI research, and inform what excellence looks like in this domain. As a Staff Machine Learning Research Scientist on our LLM Evals team, you will spearhead the creation of novel evaluation methodologies, metrics, and benchmarks to assess the strengths and weaknesses of cutting-edge LLMs. Your work will shape our internal strategies and influence the broader AI research community, making this role essential for establishing best practices in data-driven AI development.
About Scale AI
Scale AI is recognized as a leader in providing data and evaluation solutions for next-generation AI technologies. Our mission is to enhance the evaluation and benchmarking of large language models, ensuring fairness, scalability, and rigor in assessment methodologies.
Similar jobs
Research Scientist, Secure Intelligence Institute
At Perplexity, we are on the lookout for passionate researchers and engineers to join our pioneering Secure Intelligence Institute (SII). As our primary research hub, SII is dedicated to enhancing security, privacy, and trust within the realm of frontier intelligence. Our mission focuses on pushing the boundaries of AI security research, implementing significant enhancements in Perplexity's systems, and disseminating insights that bolster the wider AI ecosystem.
As a member of SII, your role will involve undertaking original and influential research aimed at bolstering the security and privacy of advanced intelligence systems. You will strive to ensure that your research is not only theoretically sound but also pragmatically applicable to improve systems that are relied upon daily by millions of users and thousands of businesses. You will be expected to effectively translate your research findings, as well as advancements from the broader research community, into actionable improvements that safeguard Perplexity's users.
Overview
Become an integral part of our dynamic R&D team dedicated to developing fully automated research systems that push the boundaries of AI. Zochi has achieved a milestone by publishing the first entirely AI-generated A* conference paper. Locus has set a new industry standard as the first AI system to surpass human experts in AI R&D.
Key Responsibilities
Conceptualize and develop innovative architectures for automated research.
Work collaboratively within a specialized team of researchers addressing cutting-edge challenges in long-horizon agentic capabilities, post-training for open-ended objectives, and environment crafting.
Document and publish key internal findings alongside success stories from external collaborations.
Qualifications
PhD or equivalent research experience in Computer Science, Machine Learning, Artificial Intelligence, or a related discipline. Outstanding candidates with significant research contributions are encouraged to apply, regardless of formal qualifications.
Demonstrated history of impactful AI/ML research contributions in academic or corporate environments.
Expertise in developing long-horizon, multi-agent systems and/or model post-training, especially in scientific domains or for open-ended discovery objectives.
A strong passion for advancing problem-solving processes and scientific discovery, thriving in high-autonomy roles and environments.
Our Culture
Competitive compensation and equity options.
Unlimited Paid Time Off (PTO), emphasizing team collaboration and a community-focused workplace.
Opportunities for conference participation and engagement in community initiatives.
Empowered roles with high levels of responsibility.
#1: We are a small, passionate team of leading investors, researchers, and industry experts committed to the mission of accelerating discovery. Join us.
Full-time|On-site|San Francisco Bay Area (San Mateo) or Boston (Somerville)
About the Role
As a Research Scientist focused on Pretraining, you will develop the foundational intelligence layer for robotics. Our mission involves training expansive robot foundation models utilizing vast multimodal datasets that encompass video, proprioception, action traces, language, and beyond. You will lead and execute large-scale training initiatives that imbue our models with groundbreaking general capabilities applicable across various embodiments, tasks, and environments. Your work will involve deeply engaging with all facets of robotic data.
Key Responsibilities:
Design and conduct extensive pretraining efforts for robot foundation models, employing transformer and diffusion architectures.
Establish model architectures, objectives, and training curricula that leverage multimodal robotic data, including vision, action, state, and language inputs.
Create scalable data mixtures and sampling strategies to effectively utilize petabyte-scale datasets.
Direct data collection operations and explore new avenues for dataset sourcing.
Conduct ablation studies to uncover insights regarding scaling laws, data quality impacts, and architectural trade-offs.
Collaborate closely with ML Infrastructure and Systems teams to enhance cluster utilization, throughput, and reliability.
Transform raw robotic interaction data into versatile model capabilities.
Ideal Candidate Profile:
Extensive experience in training large transformer or diffusion models at scale, particularly in generative tasks such as language, audio, or video modeling.
Proven leadership or significant contribution to multi-node, multi-GPU distributed training initiatives.
Experience with scaling laws, optimization dynamics, and understanding large-model failure modes.
Strong foundation in PyTorch and comfort in debugging across all layers of the computational stack.
Appreciation for empirical rigor paired with rapid iteration speed.
Enthusiasm for building general-purpose robot intelligence from foundational principles.
About Generalist
At Generalist, we are dedicated to realizing the potential of general-purpose robots. We envision a future where industries and households thrive through innovative collaborations between humans and machines. Our robots are designed to enhance productivity and facilitate the achievement of more ambitious goals.
At Causal Labs, we are on a groundbreaking mission to develop general causal intelligence: artificial intelligence that not only predicts future events but also determines the most effective actions to influence those outcomes.
To achieve this monumental goal, we are constructing a Large Physics Foundation Model (LPM). Our focus is on domains governed by physical laws, which inherently exhibit cause-and-effect relationships, setting them apart from traditional visual or textual data.
Weather serves as the ideal training environment for our LPM, being one of the most extensively observed physical systems available. It provides immediate, objective feedback from sensory observations and boasts data scales significantly larger than those currently employed to train existing language models.
Our team at Causal Labs includes leading researchers and engineers with backgrounds in self-driving technology, drug discovery, and robotics, hailing from prestigious organizations such as Google DeepMind, Cruise, Waymo, Meta, Nabla Bio, and Apple. We firmly believe that achieving general causal intelligence will represent one of the most critical technological advancements for our civilization.
We are seeking innovative researchers eager to confront unsolved challenges in the field.
This role presents an opportunity to create powerful models rooted in observable feedback and verifiable ground truths. If you possess experience in pioneering research and training large-scale models from the ground up in areas such as language and vision models, robotics, or biology, we invite you to join our mission.
Join us at Physical Intelligence as a Research Scientist, where you will be at the forefront of innovation in machine learning and robotics. We are in search of exceptional researchers across all experience levels who demonstrate a strong track record of impactful research results. Ideal candidates will possess a solid foundation in both practical implementation and theoretical frameworks, showcasing a blend of system-building capabilities and significant conceptual, algorithmic, or theoretical advancements. We value diverse backgrounds and encourage applications from both traditional academic researchers and those with unique, unconventional experiences.
We are committed to fostering a diverse and inclusive workplace. In accordance with the San Francisco Fair Chance Ordinance, we welcome applications from qualified individuals with arrest and conviction records.
Join OpenAI as a Research Scientist and explore cutting-edge machine learning innovations. In this role, you will be at the forefront of developing groundbreaking techniques while advancing our team's research initiatives. Collaborate with talented peers across various teams to discover transformative ideas that scale effectively. We seek individuals who are passionate about pushing the boundaries of AI and want to contribute to our unified research vision.
Full-time|$225K/yr - $300K/yr|On-site|San Francisco
About Latent Health
Latent Health is building personalized clinical intelligence to make healthcare more accessible and effective for everyone. The company’s mission is to ensure medical history is connected and actionable, not fragmented or reserved for a privileged few. By combining broad clinical knowledge with detailed patient histories, Latent Health aims to improve care for all populations. The team works with a diverse dataset that includes complex and chronic cases, providing deep insight into a wide spectrum of clinical scenarios. Their models are designed to answer nuanced clinical questions and offer patient-specific reasoning.
Machine Learning at Latent Health
The Machine Learning group integrates advanced systems directly into clinical workflows. Current projects include:
Scaling verifiable reinforcement learning
Mid-training and post-training foundational models
Creating new objectives from longitudinal patient data
Team members work in a small, agile environment, taking responsibility for major systems and seeing research through from early-stage problem definition to validated, real-world results.
Role Overview: Research Scientist
The Research Scientist will design and develop new modeling techniques that advance clinical intelligence. This role leads research efforts from ambiguous starting points to validated solutions, shaping how models learn from patient data and influencing the future of healthcare delivery.
Location
San Francisco
Merge Labs is an innovative research facility dedicated to merging biological sciences and artificial intelligence to enhance human capability, autonomy, and experience. Our mission is to pioneer revolutionary methodologies in brain-computer interfaces that facilitate high-bandwidth interactions with the brain, seamlessly integrate advanced AI, and maintain safety and accessibility for all users.
About the Team
At Merge, we are addressing some of the most ambitious challenges in molecular engineering, synthetic biology, and neuroscience. Our Research Platform Team is responsible for creating the experimental frameworks necessary to tackle these challenges with exceptional speed and precision. The tools and methodologies developed by our team significantly enhance molecular assembly, protein expression, mammalian cell culture, advanced microscopy, sequencing, and unique custom techniques. We collaborate with program teams to establish and optimize these capabilities, implement automation where beneficial, and integrate with our data science and machine learning pipelines, continuously pushing the boundaries of throughput and innovation.
About the Role
As a Platform Scientist, you will be instrumental in developing high-efficiency and high-throughput experimental pipelines that accelerate research initiatives. You will work closely with program leads, project scientists, data scientists, and engineers, leading your work and potentially recruiting additional team members as necessary.
Key Responsibilities:
Collaborate with program leads and scientists to identify critical experimental requirements and workflows.
Develop processes to facilitate high-throughput and/or high-efficiency experiments, including reagent production and analysis.
Scope, procure, construct, program, and validate instruments to support experimental workflows.
Ensure the quality, reliability, and integrity of data generated from automated pipelines, including defining and implementing suitable quality control checkpoints.
Work alongside data science and machine learning engineers to incorporate metadata tracking, computational design, and analysis into experimental pipelines.
Partner with electrical, mechanical, and software engineers to create custom setups.
Innovate and validate concepts to enhance experimental throughput.
About the Team
Join the innovative Post-Training team at OpenAI, where we focus on refining and elevating pre-trained models for deployment in ChatGPT, our API, and future products. Collaborating closely with various research and product teams, we conduct crucial research that prepares our models for real-world deployment to millions of users, ensuring they are safe, efficient, and reliable.
About the Role
As a Research Engineer / Scientist, you will spearhead the research and development of enhancements to our models. Our work intersects reinforcement learning and product development, aiming to create cutting-edge solutions.
We seek passionate individuals with robust machine learning engineering skills and research experience, particularly with innovative and powerful models. The ideal candidate will be driven by a commitment to product-oriented research.
This position is located in San Francisco, CA, and follows a hybrid work model requiring three days in the office each week. Relocation assistance is available for new employees.
In this role, you will:
Lead and execute a research agenda aimed at enhancing model capabilities and performance.
Work collaboratively with research and product teams to empower customers to optimize their models.
Develop robust evaluation frameworks to monitor and assess modeling advancements.
Design, implement, test, and debug code across our research stack.
You may excel in this role if you:
Possess a deep understanding of machine learning and its applications.
Have experience with relevant models and methodologies for evaluating model improvements.
Are adept at navigating large ML codebases for debugging purposes.
Thrive in a fast-paced and technically intricate environment.
About OpenAI
OpenAI is a pioneering AI research and deployment organization dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We are committed to pushing the boundaries of AI capabilities while prioritizing safety and human-centric values in our products. Our mission is to embrace diverse perspectives, voices, and experiences that represent the full spectrum of humanity, as we strive for a future where AI is a powerful ally for everyone.
About Our Team
Join the forefront of AI innovation with the RL and Reasoning team at OpenAI. Our team is dedicated to advancing reinforcement learning research and has pioneered transformative projects, including o1 and o3. We are committed to pushing the limits of generative models while ensuring their scalable deployment.
About the Role
As a Research Engineer/Research Scientist at OpenAI, you will play a pivotal role in enhancing AI alignment and capabilities through state-of-the-art reinforcement learning techniques. Your contributions will be essential in training intelligent, aligned, and versatile agents that power various AI models.
We seek individuals with a solid foundation in reinforcement learning research, agile coding skills, and a passion for rapid iteration.
This position is located in San Francisco, CA, and follows a hybrid work model of three days in the office per week. We also provide relocation assistance for new hires.
You may excel in this role if:
You are enthusiastic about being at the cutting edge of RL and language model research.
You take initiative, owning ideas and driving them to fruition.
You value principled methodologies, conducting simple experiments in controlled environments to draw trustworthy conclusions.
You thrive in a fast-paced, complex technical environment where rapid iteration is essential.
You are adept at navigating extensive ML codebases to troubleshoot and enhance them.
You possess a profound understanding of machine learning and its applications.
About OpenAI
OpenAI is a pioneering AI research and deployment organization committed to ensuring that general-purpose artificial intelligence serves the greater good for humanity. We strive to push the boundaries of AI system capabilities while prioritizing safe deployment through our innovative products. We recognize AI as a powerful tool that must be developed with safety and human-centric principles, embracing diverse perspectives to reflect the full spectrum of humanity.
We are proud to be an equal opportunity employer, welcoming applicants from all backgrounds without discrimination based on race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or any other legally protected characteristic.
Join Our Team as a Research Scientist
At Parallel, we are at the forefront of web infrastructure innovation, enabling businesses across sectors such as sales, marketing, insurance, and technology to harness the power of AI. Our state-of-the-art products empower users to develop superior AI agents with seamless and flexible access to the web.
With significant backing of $130 million from prominent investors like Kleiner Perkins, Index Ventures, and Spark Capital, we are dedicated to redefining the web for artificial intelligence. As we expand, we're assembling a top-tier team of engineers, designers, marketers, sales experts, researchers, and operational specialists committed to our vision.
Your Role: As a Research Scientist, you will tackle the challenge of training and scaling models designed to enhance web indexing capabilities.
About You: You possess a profound understanding of contemporary models and training methodologies. You enjoy engaging in discussions about the convergence of search, recommendations, and transformer models, and are passionate about translating your research into impactful products and systems utilized by millions.
Zyphra is a pioneering artificial intelligence firm located in the vibrant city of San Francisco, California.
About the Role:
We are seeking a passionate Research Scientist to join our dynamic Agency and Reasoning Team at Zyphra. In this role, you will conduct cutting-edge research in reinforcement learning, post-training methodologies, and human preference learning. Your innovative ideas will be instrumental in shaping our next-generation language models, enabling their application on a large scale.
What We Desire:
A strong sense of research intuition and taste
Capability to navigate a research project from initial concept to execution and documentation
Proficiency in implementation and prototyping
A quick thinker who can rapidly transform ideas into experimental frameworks
Ability to collaborate effectively in a fast-paced research environment
An insatiable curiosity and enthusiasm for the study of intelligence.
Qualifications:
Proven experience and skill in reinforcement learning, particularly in the context of language model reasoning or classical RL tasks
Familiarity with language-model-supervised fine-tuning and preference-learning techniques, such as DPO and simPO.
Experience with methods for context-length extension
Strong intuitive understanding of model behaviors, with the ability to refine them through iterative fine-tuning
Interest in engaging deeply with data and dedicating time to data engineering and synthetic data generation
A postgraduate degree in a scientific discipline (Computer Science, Electrical Engineering, Mathematics, Physics)
Published research in reputable machine learning venues
Expertise in PyTorch and Python
Eagerness and aptitude for rapidly acquiring new knowledge and implementing innovative concepts
Exceptional communication and teamwork abilities, capable of contributing to both research and large-scale engineering efforts
Why Join Zyphra?
We champion creative and unconventional ideas and are prepared to invest significantly in innovative concepts.
Our culture fosters collaboration, curiosity, and intellectual growth.
Zyphra is a cutting-edge artificial intelligence firm headquartered in the vibrant city of San Francisco, California.
Position Overview:
As a Research Scientist specializing in Model Architectures, you will play a pivotal role in Zyphra’s AI Architecture Research Team. Your responsibilities will include the design and thorough evaluation of innovative model architectures and training methodologies aimed at enhancing essential modeling capabilities (e.g., loss per flop or loss per parameter) and tackling core limitations inherent in current models. You will collaborate closely with our pre-training team to ensure that your findings are seamlessly integrated into our next-generation models.
Qualifications:
A strong research acumen and intuition.
Proven ability to navigate research projects from initial conception to execution and final write-up.
Exceptional implementation and prototyping skills, with the capability to swiftly transform ideas into experimental outcomes.
A collaborative spirit and the ability to thrive in a fast-paced research environment.
A deep curiosity and enthusiasm for understanding intelligence.
Requirements:
Experience with long-term memory, RAG/retrieval systems, dynamic/adaptive computation, and alternative credit assignment strategies.
Knowledge of reinforcement learning, control theory, and signal processing techniques.
A passion for exploring and critically evaluating unconventional ideas, with the ability to maintain a unique perspective.
Familiarity with modern training pipelines and the hardware necessities for designing efficient architectures compatible with GPU hardware.
Strong understanding of experimental methodologies for conducting rigorous ablations and hypothesis testing.
High proficiency in PyTorch and Python programming.
Ability to quickly assimilate into large pre-existing codebases and contribute effectively.
Prior publication of machine learning research in reputable venues.
Postgraduate degree in a scientific discipline (e.g., Computer Science, Electrical Engineering, Mathematics, Physics).
Why Join Zyphra?
We emphasize a structured research methodology that systematically addresses ambitious challenges in AI.
Join Mindlance as a Cyber Intelligence Security Analyst and be at the forefront of safeguarding our digital assets. In this role, you will leverage your analytical skills to monitor, assess, and mitigate security threats while collaborating with cross-functional teams to enhance our cyber defense strategies. This position offers a unique opportunity to contribute to the security posture of a leading organization in a dynamic environment.
About AfterQuery
AfterQuery partners with leading AI labs to advance training data and evaluation frameworks. The team builds high-signal datasets and runs thorough evaluations that go beyond standard benchmarks. As a post-Series A, early-stage company in San Francisco, AfterQuery gives each team member room to shape the future of AI models.
Role Overview: Research Scientist - Frontier Data
This role focuses on designing datasets and developing evaluation systems that influence how top AI models are trained and assessed. Working closely with research teams at major AI labs, the scientist explores new data collection techniques, investigates where models fall short, and sets up metrics to track progress. The work is hands-on and experimental, moving quickly from hypothesis to live testing and directly impacting large-scale model training.
Key Responsibilities
Design data slides and analyze data structures to uncover model weaknesses in areas like finance, software development, and enterprise operations.
Build and refine evaluation rubrics and reward signals for RLHF and RLVR training approaches.
Study annotator behavior and run experiments to improve model capabilities across different domains.
Develop quantitative frameworks to measure dataset quality, diversity, and their effect on model alignment and performance.
Work with research teams to turn training objectives into concrete data and evaluation needs.
What We Look For
Experience as an undergraduate or master’s research student (PhD not required).
Background or internships with RL environments or AI safety and benchmarking organizations (e.g., METR, Artificial Analysis) is a strong plus.
Genuine interest in how data structure, selection, and quality affect model outcomes.
Demonstrated skill in designing experiments, acting quickly, and extracting insights from complex data.
Comfort working across sectors such as finance, software engineering, and policy.
Strong quantitative background and familiarity with LLM training pipelines, RLHF/RLVR methods, or evaluation frameworks.
A hands-on mindset focused on building practical solutions.
About Our Team
Join the Foundations Research team, where we tackle ambitious and innovative projects that could redefine the future of AI. Our mission is to enhance the science behind our training and scaling initiatives, focusing on pioneering frontier models. We are dedicated to advancing data utilization, scaling methodologies, optimization strategies, model architectures, and efficiency enhancements to accelerate our scientific breakthroughs.
About the Position
We are on the lookout for a dynamic technical research lead to spearhead our embeddings-focused retrieval initiatives. You will oversee a talented team of research scientists and engineers committed to developing foundational technologies that enable models to access and utilize the right information precisely when needed. This includes crafting innovative embedding training objectives, architecting scalable vector storage, and implementing adaptive indexing techniques.
This pivotal role will contribute to various OpenAI products and internal research initiatives, offering opportunities for scientific publication and significant technical influence.
This position is located in San Francisco, CA, where we embrace a hybrid work model, requiring three days in the office weekly, and we provide relocation assistance for new hires.
Your Responsibilities
Lead cutting-edge research on embedding models and retrieval systems optimized for grounding, relevance, and adaptive reasoning.
Supervise a team of researchers and engineers in building an end-to-end infrastructure for training, evaluating, and integrating embeddings into advanced models.
Drive advancements in dense, sparse, and hybrid representation techniques, metric learning, and retrieval systems.
Work collaboratively with Pretraining, Inference, and other Research teams to seamlessly integrate retrieval throughout the model lifecycle.
Contribute to OpenAI's ambitious vision of developing AI systems with robust memory and knowledge access capabilities rooted in learned representations.
You Will Excel in This Role If You Possess
A proven track record of leading high-performance teams of researchers or engineers within ML infrastructure or foundational research.
In-depth technical knowledge in representation learning, embedding models, or vector retrieval systems.
Familiarity with transformer-based large language models and their interaction with embedding spaces and objectives.
Research experience in areas such as contrastive learning and retrieval-augmented generation.
Full-time|$188K/yr - $254K/yr|On-site|San Francisco, Boston, New York, Denver
About Semgrep
Semgrep is at the forefront of code security for developers, enabling innovative work without compromising safety. Our platform allows teams to identify, report, and rectify genuine issues before deployment, supported by an intelligent security system that evolves alongside development. Semgrep enhances code security as it is authored, providing essential guardrails that allow developers to operate swiftly while maintaining security. Built for creators and endorsed by security professionals, our solution integrates seamlessly into developers' workflows, delivering solutions that preserve productivity while granting security teams enhanced oversight, control, and assurance. As Semgrep evolves, our AI adapts to your context, minimizing false positives and prioritizing actionable vulnerabilities, a claim validated by 95% of security reviewers across over 6 million findings. We are committed to making zero false positives a reality, enabling AppSec teams to manage 80% fewer false alarms across Code and Supply Chain, significantly reducing backlog.
Founded in San Francisco and supported by investors such as Menlo Ventures, Felicis Ventures, Lightspeed Venture Partners, Redpoint Ventures, and Sequoia Capital, Semgrep has been acknowledged by Gartner in Application Security Testing and is trusted by top-tier organizations like Snowflake, Dropbox, and Figma. Discover more at semgrep.dev.
About the Role
As the Security Research Manager for the Coverage Team, you will spearhead a group of Security Researchers dedicated to enhancing detection rules for Secrets, Code, and Supply Chain across all Semgrep products. Your responsibilities will include:
Crafting high-quality detection rules
Innovating research and automation techniques to expedite and enhance rule creation
Evaluating and elevating the overall quality and scope of detections
In this managerial role, you will report directly to the Head of Security Research. You will define the strategic roadmap, collaborate with Product Management to concentrate on the most impactful detection areas, and drive ongoing enhancements in both detection accuracy and coverage breadth. Achieving success in this position means leading a team that produces exceptional detections, scales rule generation through automation and AI, and expands the limits of contemporary vulnerability research.
Your Responsibilities:
Recruit, mentor, and nurture your team, fostering a productive, engaging, diverse, and inclusive workplace that aligns with Semgrep's core values
Collaborate closely with product management, sales, and development teams across all product lines
Analyze, measure, and enhance the velocity and quality of Semgrep detections
Join the Center for AI Safety (CAIS), a pioneering research and advocacy organization dedicated to addressing the societal-scale risks posed by artificial intelligence. We tackle the most pressing challenges in AI through rigorous technical research, innovative field-building initiatives, and proactive policy engagement, in collaboration with our sister organization, the Center for AI Safety Action Fund.
As a Research Scientist, you will spearhead and conduct transformative research aimed at enhancing the safety and dependability of cutting-edge AI systems. Your responsibilities will include designing and executing experiments on large language models, developing the necessary tools for training and evaluating models at scale, and converting your findings into publishable research. You will work closely with CAIS researchers and external partners from academia and industry, utilizing our compute cluster for large-scale model training and evaluation. Your research will focus on critical areas such as AI honesty, robustness, transparency, and the detection of trojan/backdoor behaviors, all aimed at mitigating real-world risks associated with advanced AI technologies.
Full-time|$150K/yr - $275K/yr|On-site|San Francisco
AI Research Scientist
At Substrate, we are tackling a critical technological challenge that impacts the United States. Positioned at the crossroads of advanced manufacturing and innovative physics, our mission is to develop transformative technologies that will revolutionize the semiconductor industry and bolster America's technological dominance. Our team comprises top-tier scientists, engineers, and technical specialists dedicated to pushing the boundaries of technology for the benefit of the nation.
As an AI Research Scientist, you will play a key role in enhancing and accelerating research and development processes by harnessing machine learning techniques for scientific simulations and modeling. You will also focus on establishing internal AI capabilities throughout our organization. This position merges cutting-edge physics with artificial intelligence, requiring hands-on development of AI-enhanced tools that facilitate groundbreaking research. You will also contribute to building the infrastructure and expertise required for our technical teams to effectively use AI in their workflows. Whether you are a physicist who has adopted machine learning or an AI expert with a solid scientific background, you will be instrumental in shaping our approach to utilizing AI to expedite our internal R&D efforts.
Oct 28, 2025