AI Architecture Intern Inference jobs in San Jose – Browse 207 openings on RoboApply Jobs

AI Architecture Intern Inference jobs in San Jose

Open roles matching “AI Architecture Intern Inference” with location signals for San Jose. 207 active listings on RoboApply Jobs.

Etched
Internship|On-site|San Jose

AI Architecture Intern - Inference
Location: San Jose, CA
Team: Architecture

About Etched
At Etched, we are pioneering the development of the world’s first AI inference system specifically designed for transformers, achieving over 10 times the performance and significantly reduced cost and latency compared to traditional systems. Our innovative ASICs empower the creation of groundbreaking products, enabling real-time video generation and advanced reasoning agents that are unattainable with conventional GPUs. Supported by substantial investments from leading venture capital firms and staffed by top-tier engineering talent, Etched is at the forefront of transforming the infrastructure of the fastest-growing industry.

The Role
We are in search of a motivated Architecture Intern to join our dynamic team, contributing to the design and optimization of next-generation AI accelerators. This role will involve developing and fine-tuning compute architectures that deliver outstanding performance and efficiency for transformer workloads. Throughout your internship, you will tackle cutting-edge architectural challenges and engage in performance modeling.

Key Responsibilities
Assist in adapting state-of-the-art models to our architecture and develop programming abstractions and testing capabilities for rapid model iteration.
Help enhance and scale Sohu’s runtime, including multi-node inference, intra-node execution, state management, and robust error handling.
Contribute to the optimization of routing and communication layers utilizing Sohu’s collectives.
Employ performance profiling and debugging tools to pinpoint bottlenecks and correctness issues.
Gain a deep understanding of Sohu to collaboratively design hardware instructions and model architecture operations to maximize performance.
Implement high-performance software components for the Model Toolkit.

Qualifications
Currently pursuing a Bachelor’s, Master’s, or PhD in Computer Science, Computer Engineering, Applied Mathematics, or a related discipline.
Strong programming skills in Python and C++.
Familiarity with performance-sensitive or complex distributed software systems, such as Linux internals, accelerator architectures (e.g., GPUs, TPUs), and compilers.

Dec 8, 2025
Etched
Full-time|On-site|San Jose

About Etched
Etched is pioneering the world's first AI inference system specifically designed for transformers, achieving over 10x higher performance, significantly reduced costs, and minimal latency compared to B200 systems. Our custom ASICs enable the development of innovative products that were previously unattainable with GPUs, such as real-time video generation models and advanced chain-of-thought reasoning agents. With substantial backing from leading investors and a team of top engineers, Etched is revolutionizing the infrastructure of one of the fastest-growing industries in history.

Key Responsibilities
Assist in porting cutting-edge models to our architecture and contribute to the development of programming abstractions and testing capabilities to streamline the model porting process.
Develop, enhance, and scale Sohu’s runtime, focusing on multi-node inference, intra-node execution, state management, and effective error handling.
Optimize routing and communication layers utilizing Sohu's collectives.
Employ performance profiling and debugging tools to pinpoint bottlenecks and correctness challenges.

Ideal Candidate Profile
Strong proficiency in C++ or Rust programming languages.
Solid understanding of performance-critical and complex distributed software systems, including Linux internals, accelerator architectures (e.g., GPUs, TPUs), compilers, and high-speed interconnects (e.g., NVLink, InfiniBand).
Familiarity with machine learning frameworks such as PyTorch or JAX.
Experience in porting applications to non-standard accelerator hardware or platforms.

Preferred Qualifications
Experience in developing low-latency, high-performance applications using both kernel-level and user-space networking stacks.
In-depth understanding of distributed systems concepts, algorithms, and challenges, including consensus protocols and communication patterns.
Thorough knowledge of Transformer architectures, particularly Mixture-of-Experts (MoE).
Experience building applications with substantial SIMD (Single Instruction, Multiple Data) optimizations for performance-critical paths.

Benefits
Health Insurance
401k

Jun 17, 2025
Etched
Full-time|On-site|San Jose

About Etched
Etched is pioneering the development of the world's first AI inference system specifically designed for transformer models, achieving over 10x the performance while significantly reducing costs and latency compared to traditional solutions like the B200. With Etched ASICs, our technology enables the creation of groundbreaking products such as real-time video generation and highly advanced reasoning agents. Supported by substantial investment from leading venture capitalists and comprised of a team of top engineers, Etched is reshaping the infrastructure of one of the fastest-growing sectors in technology.

Job Summary
We are on the lookout for a skilled ASIC Architect to join our architecture team and play a vital role in crafting next-generation AI accelerators. This position is centered on the design and enhancement of computing architectures that excel in performance and efficiency specifically tailored for transformer workloads. You will engage with state-of-the-art architectural challenges and performance modeling while collaborating across various teams to transform innovative chip designs from concept to reality.

Key Responsibilities
Microarchitecture & Dataflow Innovation: Design and evaluate chip architectures optimized for AI/ML workloads, emphasizing throughput, latency, and power efficiency.
Design Next-Generation Silicon: Assist in developing methodologies for power and area estimation during the early stages of architectural exploration.
Custom Circuit Development: Generate architectural specifications and interface definitions for compute blocks and subsystems.
System-Level Prototyping: Collaborate with RTL, verification, physical design, and software teams to ensure architectural viability and optimization.
Performance Optimization: Execute architectural experiments using cycle-accurate simulators and analytical models.
Cross-Functional Collaboration: Aid integration efforts by providing architectural insights and resolving design challenges.

Aug 5, 2025
Astera Labs
Internship|On-site|San Jose, CA

Astera Labs (NASDAQ: ALAB) develops rack-scale AI infrastructure, offering advanced connectivity solutions for data centers and AI applications. The company partners with hyperscalers and industry leaders to enable organizations to maximize the potential of modern AI technologies. Its Intelligent Connectivity Platform brings together CXL®, Ethernet, NVLink, PCIe®, and UALink™ semiconductor technologies, integrated with the COSMOS software suite, to support diverse connectivity needs. Astera Labs also delivers custom solutions alongside its standards-based portfolio, helping clients build architectures suited to their specific requirements. More details can be found at www.asteralabs.com.

Role overview
This AI/ML Intern for Silicon Development Automation role is based in San Jose, CA. The internship centers on developing AI-driven workflows that accelerate silicon chip design. The work involves using generative AI and deep learning to enhance Electronic Design Automation (EDA) processes, with a particular emphasis on frontend design and verification.

What you will do
Support the creation of AI-powered automation for silicon development, including circuit design, verification, and debugging tasks.
Assist in integrating EDA tools with large language model (LLM) capabilities.

Apr 22, 2026
LPA Design Studios
Full-time|On-site|San Jose, CA

Be a part of the 2025 AIA Architecture Firm Award Winner dedicated to creating a sustainable future.

At LPA, we are a team of designers, engineers, and researchers committed to addressing the critical challenges of our time. Our acclaimed “No Excuses” integrated design philosophy has been recognized by the AIA as “a trailblazer in sustainable, high-performance architecture.” Joining LPA means becoming part of a movement that promotes innovative thinking in design and carbon reduction, built on a foundation of collaboration and inclusivity.

We are excited to invite applications for the position of Architecture Design Coordinator within our vibrant San Jose studio. You'll thrive in a culture that champions deep collaboration, technical proficiency, and continuous personal and professional development. Notable projects from our San Jose studio include Agnews Campus, Tide Academy, and Santa Clara Square. Collaborate with multidisciplinary teams from our studios in California and Texas to bring high-performance, community-centered design to fruition.

Your Responsibilities:
As an Architecture Design Coordinator, you will play an integral role from project inception through to completion, working closely with project teams to:
Assist Project Architects, Designers, and Managers with various assignments.
Prepare and oversee documentation for schematic design, design development, and construction documents.
Develop and implement innovative design and detailing solutions.
Provide support in client and project coordination.
Aid in contract administration by reviewing submittals and addressing RFIs.
Research materials, systems, and construction methods to bolster design objectives.
Mentor junior designers while also receiving guidance from firm leadership.

What We Provide:
At LPA, we prioritize your growth and contributions to our mission of innovative design. Join us to make a tangible impact in the architectural landscape.

Jan 14, 2026
AECOM
Full-time|On-site|San Jose

Join AECOM as an Architectural Project Manager, where you'll lead innovative architectural projects that shape the future of urban landscapes. You will be responsible for overseeing project development from inception to completion, ensuring that all designs meet client specifications and regulatory requirements. Your leadership skills will be crucial in coordinating multidisciplinary teams and fostering collaboration to deliver high-quality results.

Mar 26, 2026
LPA Design Studios
Full-time|On-site|San Jose, CA

Role Overview
LPA Design Studios is seeking a Healthcare Architecture Project Manager in San Jose, CA. This position leads project teams through all phases of healthcare facility design and construction, from early concepts to final delivery. The work centers on creating environments that support healing and wellness.

What You Will Do
Guide multidisciplinary teams through the full project lifecycle, ensuring goals are met on schedule and within scope.
Collaborate closely with clients and stakeholders to develop design solutions that balance function and aesthetics.
Manage project workflows and maintain clear communication across all parties involved.
Apply expertise in healthcare architecture to shape spaces that serve both patients and staff.

Key Qualifications
Experience managing healthcare architecture projects from concept through completion.
Strong communication skills for client interaction and team leadership.
Ability to coordinate with clients, consultants, and internal teams to achieve project objectives.

Apr 15, 2026
Kognitos
Internship|On-site|San Jose - HQ

About the Internship
Kognitos is hiring Software Engineer Interns (AI-Native) for Summer 2026 at our San Jose headquarters. This program is designed for students eager to build real products with AI at the core, not just use AI tools. Interns will work on meaningful projects, contribute production-ready code, and help shape our engineering practices alongside experienced professionals.

What You'll Do
Work closely with engineers to design, build, and launch new features from concept to deployment.
Use AI-native tools like Claude and Cursor to streamline and improve development workflows.
Write clean, well-documented code that follows industry best practices.
Participate in code reviews and technical discussions to refine solutions.
Troubleshoot issues and help optimize system performance.
Contribute to projects involving AI, automation, and data systems.
Experiment, iterate, and collaborate in a team-focused setting.

Who We're Looking For
Currently enrolled in a Computer Science or related degree program, or have equivalent experience.
Solid grasp of programming fundamentals (Python, JavaScript, or similar languages).
Comfortable with modern development tools and workflows.
Experience using Git and collaborative coding practices.
Understanding of core computer science concepts like data structures and algorithms.
Strong problem-solving skills and a willingness to learn.
Clear communicator who works well in a team.

Preferred Qualifications
Experience with AI coding tools such as Claude Code, Cursor, or Copilot.
Hands-on work with LLMs, APIs, or automation workflows.
Familiarity with web technologies like React or Node.js.
Exposure to cloud platforms (AWS, GCP, Azure).
Previous internships, especially at startups, or substantial personal project experience.

Final Note
Interested candidates are encouraged to apply even if not every qualification is met.

Apr 16, 2026
Structural Design Intern

LPA Design Studios

Intern|On-site|San Jose, CA

Join a pioneering firm recognized by the AIA as a leader in sustainable and high-performance architecture. Recipient of the AIA 2025 Firm Award, LPA Design Studios comprises a dynamic team of designers and researchers dedicated to eliminating carbon emissions and fostering a more equitable and livable future.

We are currently looking for an enthusiastic Structural Design Intern to join our collaborative team of architects, engineers, interior designers, landscape architects, and master planners. As an acclaimed multidisciplinary design firm, we prioritize innovative and sustainable design practices. Our structural engineering group has received prestigious awards from the American Institute of Steel Construction and the National Council of Structural Engineers Associations (NCSEA), acknowledging the most innovative projects globally.

In this role, you will collaborate with our structural engineering team on a diverse array of projects, including educational, recreational, healthcare, and performing arts facilities. Our award-winning initiatives focus on creating a positive and lasting environmental, economic, and social impact. Learn more about our remarkable projects at LPA.

You will be part of a high-achieving, multidisciplinary design team, contributing to projects at every design stage. We provide extensive educational and mentoring opportunities, including software training, tech talks, and monthly LPA-U courses that emphasize innovative and sustainable design practices. Join a firm where your ideas are valued, creativity is encouraged, and your contributions are esteemed.

Oct 17, 2025
Etched
Internship|On-site|San Jose

GTM Intern
Location: San Jose, CA
Team: Go-To-Market (GTM)

About Etched
At Etched, we are pioneering the world’s first AI inference system specifically designed for transformers, achieving more than 10x higher performance while significantly reducing cost and latency compared to traditional B200 systems. With our custom ASICs, we enable the development of groundbreaking products, such as real-time video generation models and highly sophisticated reasoning agents. Supported by substantial investments from leading firms and driven by a team of top engineers, Etched is at the forefront of transforming the infrastructure layer in the rapidly advancing AI industry.

Our team operates on-site in San Jose, CA, working together five days a week to foster collaboration and innovation.

Role Overview
We are seeking a Go-To-Market Intern to assist in establishing the operational framework for our GTM initiatives. In this role, you will actively participate in product launches, manage deal logistics, conduct performance assessments, create customer engagement materials, and support strategic partnership development.

Key Responsibilities
Conduct research on model updates, data center advancements, and vendor news
Collaborate with executives across recruitment, operations, and engineering to facilitate customer evaluations and analyses
Coordinate cross-functionally with legal, architecture, and software teams to streamline GTM efforts
Create impactful presentations, benchmarks, and ROI/TCO models
Develop and oversee our GTM operations engine, including CRM, pipeline tracking, reporting, and lead generation infrastructure
Support product launches through operational planning, communication strategies, and competitive analysis
Monitor and report on GTM KPIs, OKRs, and overall engagement metrics
Enhance our systems for efficiency: iterate quickly, minimize obstacles, and improve execution speed

Qualifications
You might be an ideal candidate if you possess:
Prior experience in a startup environment, particularly in operations, GTM strategy, or chief-of-staff roles
A strong interest in AI infrastructure, deep technology, or semiconductor industries
Operational discipline coupled with a knack for storytelling and customer insight
The ability to work across diverse functions, from engineering to product management

Dec 8, 2025
Western Digital Corporation
Internship|On-site|San Jose

Western Digital Corporation is seeking interns for Summer 2026 to join the IT Manufacturing Architecture team in San Jose. This internship offers practical experience working with professionals who design and maintain IT systems that support manufacturing operations.

Role overview
Interns will participate in projects that contribute directly to manufacturing architecture initiatives. The team environment encourages collaboration with experienced IT staff, allowing interns to play a role in solutions for advanced manufacturing settings.

What you will do
Support projects related to manufacturing architecture
Work closely with IT professionals on daily tasks and project work
Help develop and refine solutions for use in manufacturing environments
Gain exposure to current technologies and practices in IT manufacturing

What you will gain
Direct experience with IT and manufacturing systems
Familiarity with the latest tools and industry methods
Opportunities to develop technical skills and collaborate within a team
Experience making a real impact on projects used in manufacturing

Apr 23, 2026
Etched
Internship|On-site|San Jose

About Etched
Etched is pioneering the development of the world's first AI inference system uniquely designed for transformers, achieving over tenfold improvements in performance while significantly reducing cost and latency compared to traditional systems like the B200. With Etched ASICs, you can create groundbreaking products previously deemed impossible with GPUs, including real-time video generation models and highly intricate chain-of-thought reasoning agents. Supported by substantial investment from premier investors and a team of top engineers, Etched is revolutionizing the foundational infrastructure for the fastest-expanding industry in history.

Job Summary
As a Design Verification Intern, you will play a critical role in ensuring the reliability and performance of the custom IPs that drive our chips, including systolic arrays, DMA engines, and NoCs. This position requires innovation, strong technical skills, and a proactive approach to complex verification challenges. You will work closely with architects, RTL designers, and software/firmware/emulation teams to validate correctness and performance across the entire hardware-software stack.

Qualifications
Currently pursuing a Bachelor’s, Master’s, or PhD in electrical engineering, computer engineering, or a relevant field.
Knowledge of high-speed digital logic.
Familiarity with ASIC or SoC design principles.
Experience with SystemVerilog, UVM, or Python.
Understanding of verification processes and test bench development.
Acquainted with physical design flows and related tools.
Eager to quickly learn about transformers and various aspects of modern artificial intelligence.

Preferred Qualifications
Experience with transformer models and machine learning.
Familiarity with UVM or formal verification techniques.
Proficiency in Python or similar scripting languages.

We encourage all candidates to apply, even if they do not meet every single qualification.

Program Details
12-week paid internship running from June to August 2026.
Generous housing assistance for those relocating.

Feb 7, 2026
Etched
Full-time|On-site|San Jose

About Etched
At Etched, we are pioneering an innovative AI inference system specifically designed for transformer architectures, achieving over ten times the performance while significantly reducing costs and latency compared to traditional solutions. Our cutting-edge ASIC technology enables the development of groundbreaking products, including real-time video generation models and sophisticated reasoning agents capable of deep, parallel thought processes. With substantial backing from leading investors and a team of top-tier engineers, Etched is at the forefront of redefining the infrastructure for the rapidly expanding AI industry.

As we continue to grow, we seek a dedicated Recruiting Coordinator to enhance our talent acquisition efforts. This position offers a unique chance to influence the trajectory of a company poised to rival industry giants like NVIDIA.

Key Responsibilities:

Interview Scheduling & Logistics:
Facilitate end-to-end interview scheduling for diverse hiring requirements.
Oversee organizational logistics for onsite interviews.
Welcome candidates onsite, ensuring a positive experience.
Serve as the primary planner for scheduling and interviews.

Candidate Experience:
Foster a seamless and positive experience for candidates throughout the process.
Maintain clear and professional communication with candidates, both written and verbal.
Coordinate candidate travel and reimbursement, both domestic and international.
Collaborate with Executive Operations to enhance the experience for senior and executive candidates.

Operational Support:
Collaborate with recruiters, sourcers, and hiring managers to optimize processes.
Utilize tools such as Ashby, G-Suite, Notion, and Slack effectively.
Address and resolve operational challenges in recruiting through data analysis and collaboration.
Adapt to a dynamic and evolving work environment.

Representative Projects:
Overseeing the scheduling of a high-volume interview pipeline.
Coordinating intricate travel arrangements for international candidates.
Troubleshooting and resolving scheduling conflicts efficiently.
Improving onboarding procedures for new hires.

Feb 25, 2026
Etched
Internship|On-site|San Jose

About Etched
At Etched, we are pioneering the first AI inference system specifically designed for transformers, achieving over 10x higher performance and significantly reduced costs and latency compared to a B200. With our custom ASICs, we empower the creation of groundbreaking products that surpass the capabilities of traditional GPUs, such as real-time video generation models and advanced reasoning agents. Supported by substantial investments from leading venture capitalists and staffed by top talent in the field, Etched is at the forefront of transforming the infrastructure layer for the fastest-growing industry in history.

Job Summary
As a Firmware Intern, you will contribute to the development of firmware for our custom-designed ASICs, aimed at efficiently running large transformer models. Your work will span across various levels, from low-level drivers and hardware interfaces to system initialization and integration with runtime libraries and model-execution frameworks. The focus will be on ensuring that our hardware operates reliably and performs optimally, facilitating high-throughput inference and training workloads. You will collaborate with hardware, architecture, and software teams to validate new silicon features and support real-world AI applications.

You may be a great fit if you possess
Progress towards a Bachelor's, Master's, or PhD in Computer Science, Engineering, or a related field.
Proficiency in C/C++ or Rust programming languages.
Solid understanding of data structures and algorithms.
Strong grasp of low-level software engineering principles.
Familiarity with hardware/software co-design processes.
Excellent communication and teamwork abilities.

Preferred qualifications (Nice to have)
Hands-on experience with Linux internals, kernel development, or driver debugging.
Experience in hardware diagnostics or log interpretation.
Familiarity with server virtualization or CI/CD pipelines.
Experience in embedded development using Rust.

We encourage you to apply even if you do not meet every qualification listed.

Program Details
12-week paid internship (June - August).

Feb 7, 2026
Etched
Internship|On-site|San Jose

About Etched
Etched is pioneering the world’s first AI inference system specifically designed for transformers, achieving over 10 times the performance with significantly reduced costs and latency compared to traditional models. Our cutting-edge ASIC technology enables the creation of products that are unachievable with GPUs, including real-time video generation models and advanced reasoning agents. With substantial backing from top-tier investors and a team of exceptional engineers, Etched is transforming the infrastructure landscape for the rapidly evolving AI industry.

Job Summary
As a mechanical/thermal engineering intern, you will become an integral part of our mechanical and thermal engineering team, focusing on New Product Introduction (NPI) prototyping, manufacturing fixture development, liquid cooling technologies, and rack-level system design. Our Platform Team is responsible for developing a comprehensive full-stack system that supports Etched Silicon from PCB to Rack. We are looking for interns eager to engage in all facets of mechanical and electrical engineering to help shape the future of high-power ASIC systems.

Ideal Candidates Will Have:
Progress towards a Bachelor’s or Master’s degree in Mechanical Engineering, Computer Engineering, or a related field.
Hands-on experience in building mechanical systems, circuit boards, or similar projects.
Familiarity with the board design cycle, high-speed interconnects, power distribution, and system integration.

We encourage applications from candidates who may not meet every single qualification.

Internship Program Details:
12-week paid internship (June - August 2026)
Generous housing support for interns relocating to San Jose
Daily lunch and dinner provided at our office
Position based in our San Jose, CA office
Direct mentorship from industry leaders and top-tier engineers
Opportunity to contribute to addressing some of the most critical challenges of our time

For inquiries, please contact internships@etched.com

Feb 7, 2026
LPA Design Studios
Full-time|On-site|San Jose, CA

LPA Design Studios is on the lookout for a dedicated Entry Level Landscape Architectural Designer to become a vital part of our collaborative team, which includes Architects, Engineers, Interior Designers, Landscape Architects, and Master Planners. You will have the chance to engage with colleagues across various studios on both local and statewide projects in California and Texas. Our diverse clientele ranges from public to private entities, spanning nine different market segments, offering you a multitude of career paths.

As a member of a high-achieving, multi-disciplinary design team, your contributions will influence projects at every design phase. We are committed to fostering your professional growth through extensive educational and mentoring opportunities, including software training, technology discussions, and monthly LPA-U courses focused on innovative and sustainable design practices. Join us to have your voice heard, where your creative design ideas are welcomed, and your contributions are valued.

Our commitment to diversity, wellness, and work-life balance is prominently reflected in our Just label. We offer competitive salaries and a wealth of benefits, including health and dental insurance, retirement plans, and wellness programs to support your work-life balance.

Jan 8, 2026
Etched
Full-time|$150K/yr - $275K/yr|On-site|San Jose

About Etched
Etched is at the forefront of innovation, creating the world’s first AI inference system specifically designed for transformers. Our technology delivers over 10x the performance and significantly reduces cost and latency compared to traditional systems like the B200. With our advanced ASICs, we empower the development of groundbreaking products including real-time video generation models and highly sophisticated chain-of-thought reasoning agents. Supported by substantial investment from top-tier VCs and staffed by a team of elite engineers, Etched is reshaping the infrastructure for the fastest-growing industry in history.

Job Summary
We are on the lookout for a driven and detail-oriented Supercomputing Engineer (Test) to join our dynamic team. This integral position is crucial for maintaining the reliability and stability of our high-performance inference server hardware and software. In this role, you will design, develop, and execute comprehensive burn-in test suites, analyze test results, and collaborate closely with both hardware and software engineering teams at Etched and our ODM partners to swiftly identify and rectify potential issues. You will play a vital role in ensuring that our server products uphold the highest quality standards before reaching our valued customers.

Key Responsibilities
Test Development: Craft, develop, and implement automated burn-in test suites utilizing common scripting languages (Python, Go, Bash) and testing frameworks, covering all facets of System Operation including boot sequences, root-of-trust, system management, workload deployment, and performance.
Test Execution: Conduct burn-in tests on server hardware, monitor system performance and health, and interpret test results.
Failure Analysis: Delve into and troubleshoot hardware and software failures uncovered during testing, delivering detailed reports and mitigation strategies.
Collaboration: Engage with both internal and external hardware and software engineering teams to pinpoint root causes of failures and implement corrective measures.
Test Infrastructure: Aid in the creation and upkeep of the burn-in testing infrastructure, encompassing portable test environments and automation tools operable in any setting.
Documentation: Generate and maintain thorough documentation for test plans, test cases, and results.
Performance Analysis: Evaluate system performance metrics to identify areas for enhancement.

Jun 11, 2025
Etched
Full-time|$150K/yr - $275K/yr|On-site|San Jose

About Etched
Etched is pioneering the development of the world’s first AI inference system specifically designed for transformers, achieving performance metrics that exceed standard models by over 10x while significantly decreasing costs and latency compared to traditional GPUs. With our cutting-edge ASIC technology, we empower the creation of groundbreaking products, such as real-time video generation models and advanced reasoning agents that feature deep and parallel processing capabilities. Supported by substantial investments from leading venture capitalists and a team of top-tier engineers, Etched is at the forefront of transforming the infrastructure landscape in the rapidly evolving AI sector.

Job Summary
We are looking for enthusiastic and talented Supercomputing Engineers (Network) to enhance our dynamic team. This pivotal role involves the development, qualification, and optimization of high-performance networking solutions tailored for extensive inference workloads. As a Pod Software Engineer, your focus will be on creating and validating software that facilitates communication between Sohu inference nodes across multi-rack clusters. You will work in close collaboration with kernel, platform, and telemetry teams to maximize the efficiency of peer-to-peer RDMA communications.

Key Responsibilities
High Performance Peer to Peer Networking: Conceptualize, develop, and implement RDMA-based networking solutions that enable high bandwidth and low latency communication across PCIe nodes, both within and between racks. This role encompasses work across operating systems, kernel drivers, embedded software, and system software.
Test Development: Create and implement tests to validate host processors (x86), NICs, TORs, and device network interfaces for optimal performance.
Burn-in Integration: Provide burn-in teams with testing frameworks that simulate real-world use cases and workloads for device-to-device networking, including extreme-load stress testing.
Performance/Health Telemetry Design: Establish key metrics that system software should gather to ensure high availability and performance under demanding communication workloads.

Representative Projects
Evaluate performance deviations, refine network stack configurations, and suggest kernel tuning parameters for low-latency, high-bandwidth inference workloads.
Design and execute automated qualification tests for RDMA NICs and interconnects across a variety of server configurations.
Identify and troubleshoot network-related issues to enhance overall system performance.

Jun 11, 2025
Etched
Full-time|On-site|San Jose

About Etched
At Etched, we are pioneering the world's first AI inference system uniquely designed for transformers, achieving over 10x greater performance, along with significantly reduced costs and latency compared to conventional solutions like the B200. Our innovative ASICs enable the creation of groundbreaking products, such as real-time video generation models and highly advanced deep reasoning agents. With substantial backing from premier investors and a team of top engineers, Etched is transforming the infrastructure layer for the fastest growing industry in history.

Key Responsibilities
Develop detailed performance models and forecasts for Etched's transformer-centric architecture across various workloads and configurations.
Profile and assess deep learning workloads on Etched to detect micro-architectural bottlenecks and potential optimization areas.
Create analytical and simulation-driven models to anticipate performance across different architectural setups and design trade-offs.
Collaborate with hardware architects to influence micro-architectural decisions based on workload characteristics and performance insights.
Facilitate hardware/software co-optimization by pinpointing opportunities where architectural features can substantially enhance performance.
Analyze and optimize memory hierarchy efficiency, interconnect utilization, and computational resource effectiveness.
Establish performance benchmarking frameworks and methodologies tailored specifically for transformer inference workloads.

Performance Characterization
Construct detailed roofline models and performance forecasts for Etched across various transformer architectures (e.g., Llama, Mixtral).
Profile production inference workloads to identify and mitigate micro-architectural bottlenecks.
Evaluate memory bandwidth, compute utilization, and interconnect performance to inform next-gen architecture decisions.
Develop performance modeling tools that forecast chip behavior based on different batch sizes, sequence lengths, and model configurations.
Characterize the performance implications of architectural features such as specialized datapaths, memory hierarchies, and on-chip interconnects.
Benchmark Etched's architectural efficiency against competitive solutions to ensure industry-leading performance.

Nov 10, 2025
Astera Labs
Internship|On-site|San Jose, CA

Astera Labs, listed on NASDAQ as ALAB, develops connectivity solutions that power AI infrastructure. The company partners with hyperscalers and others in the ecosystem to help organizations fully utilize modern AI. Their Intelligent Connectivity Platform combines semiconductor technologies like CXL®, Ethernet, NVLink, PCIe®, and UALink™ with the COSMOS software suite, supporting unified and flexible large-scale systems. More information is available at www.asteralabs.com.

Role overview
This Business Development Intern position for 2026 is based in San Jose, CA. The internship centers on AI hardware and custom silicon, making it a fit for those interested in technology and the challenges of scaling large systems. The role does not require deep technical expertise. Instead, it gives insight into how organizations assess partners for custom silicon projects and how technical concepts become actionable business strategies.

Main responsibility
Research trends in AI hardware, focusing on accelerators, connectivity solutions, and semiconductor technologies.

Apr 22, 2026
