Experience Level
Entry Level
Qualifications
We are looking for candidates with a strong background in performance modeling, data analysis, and software engineering. Ideal candidates will have:
Proficiency in programming languages such as Python and C++.
Experience with machine learning frameworks and performance optimization techniques.
A degree in Computer Science, Engineering, or a related field.
Excellent problem-solving skills and the ability to work collaboratively in a fast-paced environment.
About the job
OpenAI is seeking a Performance Modeling Engineer based in San Francisco. This role centers on building and improving models that enhance the performance and efficiency of AI systems. The work directly supports the technical backbone of OpenAI’s products.
Key responsibilities
Develop and refine models aimed at optimizing the performance of AI systems.
Collaborate with engineers and data scientists to tackle technical challenges as they arise.
Contribute to projects that improve the efficiency of large-scale AI infrastructure.
Role overview
This position offers the chance to work on foundational technology that underpins OpenAI’s products. The focus is on practical improvements and close teamwork with technical colleagues to advance the capabilities and efficiency of AI at scale.
About OpenAI
OpenAI is a leading research organization dedicated to advancing artificial intelligence in a safe and beneficial manner. Our mission is to ensure that artificial general intelligence (AGI) benefits all of humanity. Join us to work at the forefront of AI technology and contribute to projects that make a difference.
About the Role
Perplexity is seeking a talented Model Behavior Architect to join our innovative AI team in San Francisco. In this role, you will be instrumental in developing and evaluating AI products that enhance user experiences across various domains. Collaborating closely with both research and product teams, you will design strategies for prompt and context engineering that ensure high-quality interactions.
This position uniquely blends creativity and analytical skills. You will gain a profound understanding of our answer engine by rigorously testing model capabilities and working with our AI infrastructure, including system prompts, tool prompts, skills, and evaluations, to create an exceptional product experience for our users.
As the go-to expert on prompting, model quality, and behavioral consistency, you will be pivotal in the deployment of new product features and model releases.
Key Responsibilities
Context Engineering: Create, test, and refine context strategies and system prompts that influence answer engine behavior across various products, features, and use cases.
Evaluation Systems: Develop automated and semi-automated evaluation pipelines to assess model quality, detect regressions, and scale across product surfaces.
Model Launch Support: Collaborate with research and engineering teams to validate model behavior prior to and during rollouts, ensuring seamless transitions without degradation.
Research & Analysis: Identify inconsistencies and potential failure modes in model outputs through carefully designed research initiatives for both internal and production-facing systems.
Cross-functional Collaboration: Work closely with design, product, and research teams to translate product objectives into specific model behavior requirements.
Knowledge Sharing: Assist engineers across teams in developing a strong understanding of prompt design, context engineering, and evaluation best practices.
Staying Current: Keep abreast of the latest alignment, evaluation, and prompting techniques from both industry and academia, and integrate the best ideas into the team.
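The evaluation pipelines described above can be sketched in miniature: score model outputs against references, then flag a candidate model that falls meaningfully below the baseline. Everything here (the exact-match metric, the tolerance, the toy models) is illustrative, not Perplexity's actual system:

```python
def exact_match(prediction: str, reference: str) -> float:
    """Score one model output against a reference answer (1.0 or 0.0)."""
    return 1.0 if prediction.strip().lower() == reference.strip().lower() else 0.0

def evaluate(model, eval_set):
    """Run a model callable over (prompt, reference) pairs; return the mean score."""
    scores = [exact_match(model(p), ref) for p, ref in eval_set]
    return sum(scores) / len(scores)

def detect_regression(baseline_score, candidate_score, tolerance=0.02):
    """Flag the candidate if it scores more than `tolerance` below the baseline."""
    return candidate_score < baseline_score - tolerance

eval_set = [("capital of France?", "Paris"), ("2+2?", "4")]
baseline = evaluate(lambda p: {"capital of France?": "Paris", "2+2?": "4"}[p], eval_set)
candidate = evaluate(lambda p: "Paris", eval_set)  # degraded model: answers everything "Paris"
print(baseline, candidate, detect_regression(baseline, candidate))
# 1.0 0.5 True
```

A real pipeline would swap in richer metrics (LLM-as-judge, rubric scoring) and many product surfaces, but the regression-gate structure is the same.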
Join the Revolution in Behavioral Intelligence
Amplify Your Influence
You have achieved remarkable success in your career, creating robust behavioral or neuroscience models that have driven significant outcomes. You possess a talent for discerning patterns in user behavior, comprehending motivations, and optimizing end-to-end user experiences.
Now, envision extending your impact across multiple products and organizations, enhancing the entire app ecosystem. Every application at your fingertips becomes smarter, more engaging, and indispensable to its users.
Your expertise can empower product teams to innovate more rapidly, delight users, and boost revenue, all thanks to the behavioral intelligence you develop once and deploy universally.
We share this vision: our team has accomplished this repeatedly at industry leaders like Uber, Apple, Google, and Chime, generating tens of billions of dollars in value for products vital to billions globally. We are poised to elevate our impact even further.
Does this resonate with the next chapter you're seeking? If so, continue reading.
Palladio: Pioneering Breakthroughs
Palladio AI is an innovative AI platform aimed at transforming product-led growth and enhancing the value our clients provide in users' daily lives.
Our initial focus is on mobile gaming, where development is swift, user engagement is high, and experimentation yields immediate results, making it the perfect testing ground for our platform.
Your Contributions
Our team is constructing foundational systems in behavioral modeling, causal inference, forecasting, and agentic platforms. You will play a pivotal role in extending these areas: creating machine learning and AI-driven behavioral models to identify and highlight product opportunities while deploying self-improving learning loops with each iteration.
Your work will analyze user sentiments, thoughts, decisions, and actions, translating behavioral insights into opportunities that enhance product intuitiveness, engagement, and rewards. In essence, you will convert first-principles data science, neuroscience, cognitive science, and machine learning into scalable solutions across various industries.
Your Profile
User-Focused. You empathize with users' challenges, needs, and goals throughout their journeys, measure success through user outcomes, and convert insights into innovative and engaging product experiences.
Scientific Innovator. You...
Join usm2 as a Senior Data Modeler / Data Architect with expertise in Big Data and Hadoop. In this pivotal role, you will harness the power of data to drive strategic decisions and enhance business outcomes. Your experience with data modeling and architecture will be essential in building robust data ecosystems.
Zyphra is an innovative artificial intelligence company located in the heart of San Francisco, California.
The Opportunity:
Join our dynamic team as a Research Engineer - Audio & Speech Models, where you will play a pivotal role in advancing Zyphra's Audio Team. You will be instrumental in developing cutting-edge open-source text-to-speech and audio models. Your contributions will span the full spectrum of the model training process, from data collection and processing to the design of innovative architectures and training approaches.
Your Responsibilities:
Conduct large-scale audio training operations
Optimize the performance of our training infrastructure
Collect, process, and evaluate audio datasets
Implement architectural and methodological improvements through rigorous testing
What We Seek:
A strong research mindset with the ability to navigate projects from ideation to implementation and documentation.
Proficiency in rapid prototyping and implementation, allowing for swift experimentation.
Effective collaboration skills in a fast-paced research environment.
A quick learner who is eager to embrace and implement new concepts.
Excellent communication abilities, enabling you to contribute to both research and engineering tasks at scale.
Preferred Qualifications:
Expertise in training audio models, such as text-to-speech, ASR, speech-to-speech, or emotion recognition.
Experience with training audio autoencoders.
Solid understanding of signal processing, particularly in audio.
Familiarity with diffusion models, consistency models, or GANs.
Experience with large-scale (multi-node) GPU training environments.
Strong understanding of experimental methodologies for conducting rigorous tests and ablations.
Interest in large-scale, parallel data processing pipelines.
Competence in PyTorch and Python programming.
Experience contributing to large, established codebases with rapid adaptation.
Full-time|$189.5K/yr - $236.9K/yr|Remote|San Francisco, CA (Remote)
Earnest is dedicated to empowering ambitious individuals to make informed financial decisions and create the lives they aspire to lead.
Our team, known as Earnies, is passionate about providing borrowers with smarter borrowing solutions that offer a clearer path toward financial empowerment. If you share our enthusiasm for this mission, we invite you to explore the details below and join us in building something exceptional.
The Senior Model Risk Manager will report directly to the Head of Credit Risk.
In this role, you will:
Take ownership of and enhance Earnest's Model Risk Management framework, ensuring that our credit, loss forecasting, fraud, marketing, and finance models are robust, transparent, and scalable.
Conduct independent end-to-end model validations, from conceptual soundness and data quality to performance monitoring and implementation review, providing constructive feedback to modeling teams.
Collaborate closely with Data Science and Risk leaders early in the model design process to refine assumptions, enhance methodologies, and uplift modeling standards throughout the organization.
Supervise model performance monitoring and proactively identify emerging risks, performance drift, or control deficiencies, ensuring timely and effective remediation.
Produce clear, decision-ready validation reports and effectively communicate technical findings to drive impactful business outcomes and sound risk management decisions.
Act as a trusted advisor on model governance, enabling Earnest to operate swiftly while maintaining the necessary discipline and controls of a leading lending platform.
At Hover, we empower individuals to create, enhance, and safeguard the spaces they cherish. Utilizing cutting-edge AI technology built on over a decade of comprehensive real estate data, Hover addresses essential questions such as "What will it look like?" and "What will it cost?" Homeowners, contractors, and insurance professionals depend on Hover for precise, fully measured, interactive 3D models of any property, all achievable through a quick smartphone scan.
Our mission at Hover is fueled by curiosity, purpose, and a collective dedication to our customers, communities, and one another. We believe that innovation thrives through diverse perspectives and are proud to foster an inclusive, high-performance culture that encourages growth, accountability, and excellence. Supported by reputable investors like Google Ventures and Menlo Ventures, and trusted by industry frontrunners such as Travelers, State Farm, and Nationwide, we are transforming the way people perceive and engage with their environments.
Why Hover Wants You
We are establishing a new Solutions Architect role to support large enterprise clients in effectively adopting and scaling our platform. This pivotal, technical, customer-facing leadership position will oversee intricate onboarding and integration processes, translating client workflows into actionable platform configurations and technical specifications, while closely collaborating with Product and Engineering teams to bridge any gaps.
As the inaugural hire and player-coach team lead, you will set the framework for Solutions Architecture within our organization and will recruit and nurture a team that expands alongside growing demand.
Your Contributions
Enterprise Onboarding & Solution Design
Guide comprehensive technical onboarding for major enterprise clients: discovery, solution design, and readiness for launch.
Align client workflows and operational models with our products, pinpointing necessary configurations, best practices, and change management requirements.
Create scalable solutions that balance client needs, product capabilities, security/compliance mandates, and long-term sustainability.
Integrations & Technical Delivery
Lead integration strategy and execution across client systems and downstream partners (e.g., APIs, SSO, data transfer, third-party tools).
Convert requirements into clear technical specifications: data mappings, integration contracts, event flows, error-handling approaches, and monitoring strategies.
Develop repeatable integration patterns and documentation that minimize implementation time and enhance reliability.
Full-time|$250K/yr - $325K/yr|On-site|San Francisco
About World Labs:
At World Labs, we create foundational world models capable of perceiving, generating, reasoning, and interacting with the 3D environment. Our mission is to unlock the full potential of AI through spatial intelligence, transforming perception into action, reasoning into insight, and imagination into creation. We believe that spatial intelligence will revolutionize storytelling, creativity, design, simulation, and immersive experiences across both virtual and physical realms. Our world-class team is driven by curiosity and passion, boasting diverse backgrounds in technology, from AI research and systems engineering to product design. This synergy fosters a tight feedback loop between our cutting-edge research and user-empowering products.
Role Overview
We are seeking an innovative Research Scientist specializing in generative modeling, especially diffusion models, to join our modeling team. This position is ideal for individuals with extensive expertise in applying diffusion models to images, videos, or 3D assets and scenes. While not mandatory, experience in any of the following areas will be considered a significant advantage:
Large-scale model training
Research in 3D computer vision
In this role, you will work closely with researchers, engineers, and product teams to translate advanced 3D modeling and machine learning techniques into practical applications, ensuring our technology stays at the forefront of visual innovation. This position entails substantial hands-on research and engineering work, taking projects from conception to production deployment.
Key Responsibilities
Design, implement, and train large-scale diffusion models for generating 3D worlds.
Develop and experiment with large-scale diffusion models to introduce novel control signals, align with target aesthetic preferences, or optimize for efficient inference.
Collaborate closely with research and product teams to comprehend and translate product requirements into actionable technical roadmaps.
Contribute actively to all phases of model development, including data curation, experimentation, evaluation, and deployment.
Continuously investigate and integrate the latest research in diffusion and generative AI.
Serve as a key technical resource within the team, mentoring peers and promoting best practices in generative modeling and machine learning engineering.
Full-time|$160K/yr - $230K/yr|On-site|San Francisco
About Meter
At Meter, we believe that networking is at the heart of technological advancement. We have innovatively unified the entire networking stack and are now on a mission to make it autonomous.
Our team is developing a cutting-edge neural network-driven system designed to analyze raw computer networks, enabling us to address all networking challenges. As outlined on Meter.ai, we are creating models within a closed-loop system that utilizes real-time telemetry, logs, and network events to autonomously troubleshoot issues, enhance performance, and resolve challenges.
To achieve this, we require not only exceptional models but also robust infrastructure that ensures our models have clean, versioned, and low-latency access to the necessary data throughout training, evaluation, and deployment phases.
Why this Role is Essential
Each Meter network deployed in the field serves as a valuable data source for our Models team. However, without meticulous infrastructure design, this data risks becoming fragmented, outdated, or inconsistent. In this role, you will ensure that such pitfalls are avoided. You will be responsible for the core data interface that drives our model development, experimentation, evaluation, and real-time inference.
This position is fundamental and offers a significant impact. Your contributions will shape the speed at which we can train new models, the reliability of their evaluations, and their seamless operation across hundreds of real-world networks. You will collaborate closely with modelers to deliver systems that are elegant, scalable, and robust.
Your Responsibilities
Design and implement the Models API: a unified interface for accessing training, evaluation, and deployment data across raw, transformed, and feature-engineered layers.
Ensure backward compatibility and feature versioning across continually evolving schemas.
Develop scalable pipelines to ingest, transform, and serve petabytes of data across Kafka, Postgres, and Clickhouse.
Create CI/CD workflows that evolve the API in tandem with changes to the underlying data schema.
Facilitate fine-grained querying of historical and real-time data for any network, at any point in time.
Help establish and promote the principle of 'smart data, dumb functions': maximizing operations in the data layer to minimize downstream code complexity.
Collaborate with modelers to co-design training frameworks that optimize performance.
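The "Models API" responsibilities above can be pictured as a thin, versioned, point-in-time read interface over the data layers. This is a minimal in-memory sketch; every name here (`ModelsAPI`, `FeatureRecord`, `get_features`) is hypothetical and not Meter's actual interface:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class FeatureRecord:
    network_id: str
    timestamp: float
    schema_version: int      # bumped when the feature schema evolves
    features: dict[str, Any]

@dataclass
class ModelsAPI:
    """Hypothetical unified interface over raw / transformed / feature layers."""
    _store: list[FeatureRecord] = field(default_factory=list)

    def ingest(self, rec: FeatureRecord) -> None:
        self._store.append(rec)

    def get_features(self, network_id: str, as_of: float,
                     min_schema: int = 1) -> list[FeatureRecord]:
        # Point-in-time query: only records at or before `as_of`, and only
        # schema versions the caller can still parse (backward compatibility).
        return [r for r in self._store
                if r.network_id == network_id
                and r.timestamp <= as_of
                and r.schema_version >= min_schema]

api = ModelsAPI()
api.ingest(FeatureRecord("net-1", 10.0, 1, {"latency_ms": 4.2}))
api.ingest(FeatureRecord("net-1", 20.0, 2, {"latency_ms": 3.9}))
api.ingest(FeatureRecord("net-2", 15.0, 2, {"latency_ms": 8.1}))

# A training snapshot for net-1 as of t=15 sees only the first record,
# which is what makes evaluations reproducible against evolving data.
snapshot = api.get_features("net-1", as_of=15.0)
print(len(snapshot))  # 1
```

The production version would be backed by Kafka/Postgres/Clickhouse rather than a Python list, but the contract (network, time, schema version in; consistent records out) is the point.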
About the Team
The Preparedness team plays a crucial role within the Safety Systems organization at OpenAI, adhering to our Preparedness Framework.
While frontier AI models promise to bring significant benefits to humanity, they also introduce substantial risks. The Preparedness team is dedicated to ensuring that the development of advanced AI models fosters positive outcomes. Our mission includes identifying, monitoring, and preparing for catastrophic risks associated with these technologies.
Key Mission Objectives:
Monitor and predict the evolving capabilities of frontier AI systems to identify misuse risks that could significantly impact society.
Establish concrete procedures, infrastructure, and partnerships to mitigate these risks and ensure the safe development of powerful AI systems.
This fast-paced and impactful role connects capability assessment, evaluations, internal red teaming, and mitigations for frontier models, facilitating coordination on AGI preparedness.
About the Role
As a Threat Modeler, you will spearhead OpenAI's comprehensive approach to identifying, modeling, and forecasting risks from frontier AI systems. Your work will ensure that our evaluation frameworks, safeguards, and classifications are robust, comprehensive, and future-focused. You will help articulate the rationale behind our most stringent risk-prevention strategies, influencing prioritization and mitigation across various domains. This position acts as a central hub, integrating technical, governance, and policy considerations regarding our approach to frontier AI risks.
Key Responsibilities
Develop and maintain comprehensive threat models across various misuse areas (biological, cyber, attack planning, etc.).
Create plausible threat models addressing loss of control, self-improvement, and other alignment-related risks from frontier AI systems.
Forecast risks by merging technical foresight, adversarial simulation, and current trends.
Collaborate closely with technical partners on capability evaluations and risk assessments.
Join Cartesia as a Model Architecture Researcher
At Cartesia, our vision is to revolutionize AI by creating interactive intelligence that is seamlessly integrated into your daily life. Unlike current models, our goal is to develop systems capable of processing extensive streams of audio, video, and text: 1 billion text tokens, 10 billion audio tokens, and 1 trillion video tokens, directly on devices.
As pioneers in innovative model architectures, our founding team, which originated from the Stanford AI Lab, has developed State Space Models (SSMs), a groundbreaking foundation for training efficient, large-scale models. Our diverse team merges deep expertise in model innovation with a design-focused engineering approach, allowing us to create and deploy state-of-the-art models and applications.
Backed by leading investors such as Index Ventures, Lightspeed Venture Partners, and many others, including industry veterans and advisors, we are poised to shape the future of AI.
Your Contribution
In this role, you will drive forward-thinking research in neural network architecture, focusing on alternative models like state space models, efficient transformers, and hybrid architectures.
Create innovative architectures that enhance model performance, inference speed, and adaptability in various environments, from cloud infrastructures to on-device implementations.
Develop advanced capabilities for models, including statefulness, long-range memory, and novel conditioning mechanisms to boost expressiveness and generalization.
Analyze architectural decisions and their effects on model characteristics such as scalability, robustness, latency, and energy consumption.
Create frameworks and tools to assess architectural advancements, benchmarking their performance in both research and production contexts.
Collaborate with interdisciplinary teams to translate architectural insights into scalable systems that deliver real-world impact.
Your Qualifications
Extensive experience in architecture design with a focus on advanced models such as state space models, transformers, and RNN/CNN variants.
In-depth understanding of the interplay between architectural designs and system constraints, particularly in cloud and on-device deployments.
Strong proficiency in the design and evaluation of neural network architectures.
Role overview
The Performance Modeling Lead at OpenAI works from San Francisco and takes on both technical and leadership responsibilities. This position centers on developing new modeling methods that enhance performance across a variety of applications. Alongside direct technical contributions, the role involves guiding a team and shaping project direction.
What you will do
Develop and improve modeling strategies to raise performance metrics for multiple projects.
Use expertise in data analysis, machine learning, and optimization to address complex problems.
Lead and mentor a team, supporting their technical development and ensuring strong project outcomes.
AI Financial Modeling Extern — F2 AI
Location: San Francisco, CA / In-Person or Remote
Commitment: 5+ hours per week | 4 - 12+ weeks
Compensation: $50/hr
About F2 AI
At F2 AI, we are revolutionizing private market investments. Our cutting-edge AI technology streamlines the process of analyzing complex, unstructured deal materials, transforming them into actionable, investment-grade insights in mere minutes. By empowering private credit funds, commercial banks, and private equity firms, we enable faster and more confident capital deployment. Supported by top-tier investors such as NFX and Y Combinator, we are committed to expanding our exceptional product and engineering teams, shaping the future of vertical AI for finance.
Role Overview
We are on the lookout for 1–2 exceptional externs with a strong foundation in Investment Banking or Private Equity to contribute to the development of AI-driven financial modeling on the F2 platform.
In this role, you will collaborate closely with our Engineering, Product, and Design (EPD) teams in the San Francisco office to translate institutional-level financial modeling standards into automated, intelligent workflows. This hands-on experience will allow you to shape the future of AI in financial modeling.
Key Responsibilities
Educate F2 agents on best practices for financial modeling.
Create and standardize financial modeling templates optimized for AI execution using a first-principles approach.
Establish formatting, structure, and best practices that align with institutional modeling standards.
Conduct rigorous quality assurance on AI-generated outputs to guarantee precision that meets investor expectations.
Test edge cases and assist in identifying potential failures in automated modeling workflows.
Ideal Candidate Profile
Prior experience in Investment Banking, Private Credit, or Private Equity with extensive exposure to financial modeling.
Demonstrated ability to build and audit complex 3-statement, LBO, or credit models from the ground up.
Strong understanding of model hygiene, structure, and institutional formatting standards.
Critical thinker who enjoys analyzing model logic and stress-testing systems.
Passionate about leveraging AI to enhance financial workflows.
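The mechanics behind the 3-statement, LBO, and credit models mentioned above can be illustrated with a toy cash-sweep debt schedule; all numbers, and the assumption that every dollar of free cash flow sweeps to principal, are illustrative rather than any F2 template:

```python
def debt_schedule(opening_debt, rate, fcf_by_year):
    """Toy LBO-style schedule: pay interest, then sweep remaining FCF to principal."""
    rows, debt = [], opening_debt
    for fcf in fcf_by_year:
        interest = debt * rate
        paydown = min(debt, max(0.0, fcf - interest))  # can't pay down below zero
        debt -= paydown
        rows.append({"interest": round(interest, 2),
                     "paydown": round(paydown, 2),
                     "closing_debt": round(debt, 2)})
    return rows

# $100 of debt at 10%, $30 of free cash flow per year.
rows = debt_schedule(100.0, 0.10, [30.0, 30.0, 30.0])
# Year 1: 10.0 interest, 20.0 paid down, 80.0 remaining.
print(rows[0])
```

Auditing a model of this kind means checking exactly these mechanical links (interest accrues on opening balance, paydown is capped, balances tie year to year), which is the quality-assurance work the role describes.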
Employee Applicant Privacy Notice
About Us:
Join us in shaping a brighter financial future. At SoFi, we are transforming how individuals engage with their personal finances. As a next-generation financial services company and national bank, we leverage cutting-edge, mobile-first technology to empower millions of members in achieving their financial goals. Amidst significant industry changes, we are proud to lead the way and positively impact lives through our core values. Become a part of our journey to invest in yourself, your career, and the financial landscape.
The Role:
We are looking for a Fraud Model Developer to join our Lending Business Insights and Data Science team. This role will involve collaborating closely with SoFi's Product, Engineering, and Finance teams.
The successful candidate will work to enhance our application funnels through experimentation, analyze product usage across various member segments, and facilitate data-driven strategic decision-making. We are seeking an individual who is innovative, hands-on, possesses a strong sense of ownership, and is dedicated to delivering results while also mentoring and developing team members. We value a commitment to continuous learning and growth.
Role Overview
At Mariana Minerals, we are on a mission to revolutionize refining processes for critical minerals, playing a pivotal role in the global energy transition. We are in search of a dynamic and driven Process Modeling Engineer who will be integral to this endeavor.
In this position, you will take charge of developing, validating, and optimizing heat and material balance models utilizing advanced software such as ASPEN Plus/HYSYS, SysCAD, OLI Studio, or METSIM. You will collaborate closely with R&D, pilot operations, and project execution teams to transform lab and pilot data into robust, scalable process models that are essential for the design of groundbreaking mineral refining facilities.
Key Responsibilities
Create both steady-state and dynamic process models to determine heat and material balances for integrated mineral refinery systems using ASPEN, SysCAD, OLI, or METSIM.
Automate the sizing of equipment and processes (including reactors, heat exchangers, filters, crystallizers, evaporators, and separators) based on model outputs, linking models to datasheets and other engineering tools.
Develop and maintain comprehensive process simulation databases to ensure consistency and traceability among modeling assumptions, test data, and engineering outputs.
Calibrate and reconcile models using operational data from pilot plants to ensure model accuracy and predictive validity.
Conduct optimization studies to enhance energy recovery, recycling strategies, and material efficiency.
Develop dynamic models for validating PLC and DCS programming while assessing buffer sizing throughout the design process.
Integrate process models with CAPEX and OPEX estimation tools to streamline techno-economic model development.
Document modeling methodologies and results, ensuring clear technical communication for design reviews, techno-economic assessments, and regulatory submissions.
We are seeking an innovative and experienced .NET Architect to join our team in San Francisco. As a .NET Architect, you will play a critical role in designing and implementing robust software solutions that meet our clients' needs. You will work closely with development teams to ensure architectural integrity and optimal performance of applications.
Zyphra is a cutting-edge artificial intelligence firm headquartered in San Francisco, California.
Position Overview:
As a Research Scientist specializing in Model Architectures, you will play a pivotal role in Zyphra's AI Architecture Research Team. Your responsibilities will include the design and thorough evaluation of innovative model architectures and training methodologies aimed at enhancing essential modeling capabilities (e.g., loss per flop or loss per parameter) and tackling core limitations inherent in current models. You will collaborate closely with our pre-training team to ensure that your findings are seamlessly integrated into our next-generation models.
Qualifications:
A strong research acumen and intuition.
Proven ability to navigate research projects from initial conception to execution and final write-up.
Exceptional implementation and prototyping skills, with the capability to swiftly transform ideas into experimental outcomes.
A collaborative spirit and the ability to thrive in a fast-paced research environment.
A deep curiosity and enthusiasm for understanding intelligence.
Requirements:
Experience with long-term memory, RAG/retrieval systems, dynamic/adaptive computation, and alternative credit assignment strategies.
Knowledge of reinforcement learning, control theory, and signal processing techniques.
A passion for exploring and critically evaluating unconventional ideas, with the ability to maintain a unique perspective.
Familiarity with modern training pipelines and the hardware requirements for designing efficient architectures compatible with GPU hardware.
Strong understanding of experimental methodologies for conducting rigorous ablations and hypothesis testing.
High proficiency in PyTorch and Python programming.
Ability to quickly assimilate into large pre-existing codebases and contribute effectively.
Prior publication of machine learning research in reputable venues.
Postgraduate degree in a scientific discipline (e.g., Computer Science, Electrical Engineering, Mathematics, Physics).
Why Join Zyphra?
We emphasize a structured research methodology that systematically addresses ambitious challenges in AI.
ABOUT BASETEN
At Baseten, we are at the forefront of AI innovation, providing critical inference solutions for leading AI companies like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer. Our platform combines advanced AI research, adaptable infrastructure, and intuitive developer tools, empowering organizations to deploy state-of-the-art models effectively. With rapid growth and a recent $300M Series E funding round backed by top-tier investors including BOND, IVP, Spark Capital, Greylock, and Conviction, we invite you to join our mission of building the platform of choice for engineers delivering AI products.

THE ROLE
As a member of Baseten's Model Performance (MP) team, you will play a pivotal role in ensuring our platform's model APIs are fast, reliable, and cost-effective. Your primary focus will be developing and optimizing the infrastructure that supports our hosted API endpoints for cutting-edge open-source models. The role involves working with distributed systems, model serving, and the developer experience. You will collaborate with a small, dynamic team at the intersection of product development, model performance, and infrastructure, defining how developers interact with AI models at scale.

RESPONSIBILITIES
Design, develop, and maintain the Model APIs surface, focusing on advanced inference features such as structured outputs (JSON mode, grammar-constrained generation), tool/function calling, and multi-modal serving.
Profile and optimize TensorRT-LLM kernels, analyze CUDA kernel performance, write custom CUDA operators, and improve memory-allocation patterns for maximum efficiency across multi-GPU setups.
Implement performance improvements across runtimes based on a deep understanding of their internals, including speculative decoding, guided generation for structured outputs, and custom scheduling algorithms for high-performance serving.
Develop robust benchmarking frameworks to evaluate real-world performance across diverse model architectures, batch sizes, sequence lengths, and hardware configurations.
Enhance performance across runtimes (e.g., TensorRT, TensorRT-LLM) through techniques such as speculative decoding, quantization, batching, and KV-cache reuse.
Integrate deep observability mechanisms (metrics, traces, logs) and establish repeatable benchmarks to assess speed, reliability, and quality.
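The benchmarking work described above can be sketched with a minimal summary function that reports latency percentiles and aggregate token throughput from recorded per-request timings. A sequential run is assumed, and the numbers are synthetic, not Baseten measurements:

```python
def summarize(latencies_s, tokens_per_request):
    """Summarize a benchmark run: p50/p95 latency and aggregate token throughput."""
    ordered = sorted(latencies_s)

    def pct(p):
        # Nearest-rank percentile over the sorted latencies.
        idx = min(len(ordered) - 1, int(round(p / 100 * (len(ordered) - 1))))
        return ordered[idx]

    total_time = sum(latencies_s)  # sequential-run assumption
    return {
        "p50_s": pct(50),
        "p95_s": pct(95),
        "tokens_per_s": sum(tokens_per_request) / total_time,
    }

# Synthetic run: 5 requests, 128 generated tokens each.
stats = summarize([0.8, 0.9, 1.0, 1.1, 2.0], [128] * 5)
```

A real harness would sweep batch size, sequence length, and hardware configuration and record traces alongside these summaries, but the percentile-plus-throughput shape of the report stays the same.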
About Our Team
Join the Inference team at OpenAI, where we leverage cutting-edge research and technology to deliver exceptional AI products to consumers, enterprises, and developers. Our mission is to empower users to harness the full potential of our advanced AI models, enabling unprecedented capabilities. We prioritize efficient, high-performance model inference while accelerating research advancements.

About the Role
We are seeking a passionate Software Engineer to optimize some of the world's largest and most sophisticated AI models for deployment in high-volume, low-latency, highly available production and research environments.

Key Responsibilities
Collaborate with machine learning researchers, engineers, and product managers to bring our latest technologies into production.
Work closely with researchers to enable advanced research initiatives through innovative engineering solutions.
Implement new techniques, tools, and architectures that improve the performance, latency, throughput, and effectiveness of our model inference stack.
Develop tools to identify bottlenecks and sources of instability, then design and implement solutions for priority issues.
Optimize our code and Azure VM fleet to make the most of every FLOP and GB of GPU RAM available.

You Will Excel in This Role If You
Possess a solid understanding of modern machine learning architectures and an intuitive grasp of performance-optimization strategies, especially for inference.
Take ownership of problems end to end, and are willing to learn whatever is needed to achieve results.
Bring at least 5 years of professional software engineering experience.
Have, or can quickly develop, expertise in PyTorch, NVIDIA GPUs, and the relevant optimization software stacks (such as NCCL and CUDA), along with HPC technologies like InfiniBand, MPI, and NVLink.
Have experience architecting, building, monitoring, and debugging production distributed systems, with bonus points for work on performance-critical systems.
Have successfully rebuilt or significantly refactored production systems multiple times to accommodate rapid scaling.
Are self-driven and enjoy identifying and tackling the most critical problems.
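On the "every GB of GPU RAM" point above: in LLM serving, a large share of that memory goes to the KV cache, whose footprint follows directly from the model shape. A sketch using a hypothetical 7B-class configuration, not an OpenAI model:

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch, dtype_bytes=2):
    # Each token stores one key and one value vector per layer per KV head,
    # hence the leading factor of 2.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * dtype_bytes

# Hypothetical config: 32 layers, 8 KV heads (GQA), head_dim 128,
# 4096-token context, batch 16, fp16 (2 bytes per element).
gib = kv_cache_bytes(32, 8, 128, 4096, 16, 2) / 2**30
```

Even this modest configuration commits 8 GiB to the cache, which is why grouped-query attention, cache quantization, and KV-cache reuse figure so heavily in inference optimization.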
Join Hive as a Product Manager on our Hive Models team, where you will collaborate with cross-functional stakeholders to define product requirements and oversee the implementation of cutting-edge AI models. You will lead development initiatives among Machine Learning, Core Infrastructure, and Product teams, playing a pivotal role in our cloud-hosted deep learning solutions. As a key player in our organization, you will not only support our existing clients but also drive innovation and growth within our product offerings.
Jun 22, 2021