HPC Systems Engineer - Advanced Computing Specialist

Data Systems Analysts, Inc., Charlottesville, VA
On-site, Full-time

Key Responsibilities:

- Facilitate the installation, configuration, and maintenance of HPC cluster environments, including compute nodes, schedulers, and associated infrastructure.
- Configure and manage HPC job schedulers, such as Slurm or PBS, including queue setups, resource allocation policies, and job scheduling enhancements.
- Support containerized workloads utilizing technologies like Docker, Podman, or Singularity/Apptainer within HPC cluster environments.
- Assist with cluster provisioning and node management using tools like xCAT, Warewulf, or similar deployment frameworks.
- Engage in initial cluster build-out activities, including compute node integration, scheduler setup, and validation of provisioning systems.
- Troubleshoot hardware, operating system, scheduler, and networking challenges affecting HPC cluster performance.
- Support configuration and performance optimization of high-performance networking technologies, including RDMA-capable interconnects such as InfiniBand.
- Assist in the integration of compute nodes and hardware components into existing cluster environments.
- Develop automation and operational tools using Bash, Python, or similar scripting languages.

About the job

All hired employees are expected to have experience with Microsoft Copilot and/or an approved equivalent AI solution.

Description:

Data Systems Analysts, Inc. (DSA) is actively seeking a highly skilled HPC Systems Engineer with TS/SCI clearance to develop and maintain cutting-edge high-performance computing environments in a secure research and development context. The successful candidate will play a crucial role in the installation, configuration, and ongoing support of HPC clusters utilized for distributed simulation tasks, scientific computations, and GPU-driven processing.

The HPC Systems Engineer will collaborate with infrastructure teams to effectively configure, integrate, and sustain HPC cluster platforms, including job schedulers, cluster provisioning systems, high-speed interconnects, and distributed compute nodes. This position emphasizes platform-level troubleshooting, automation, and performance optimization to guarantee HPC systems operate reliably and efficiently for mission-critical users.

Proficiency in Linux systems engineering and administration, along with scripting capabilities, is essential. Familiarity with HPC technologies such as cluster schedulers, distributed computing frameworks, high-performance networking, and GPU compute environments is highly desirable.

This role is based onsite in Charlottesville, VA.

About Data Systems Analysts, Inc.

Data Systems Analysts, Inc. (DSA) is a leader in providing innovative computing solutions and services, specializing in high-performance computing for organizations requiring advanced analytics and simulation capabilities. Committed to excellence and security, DSA delivers state-of-the-art technology solutions tailored to meet the unique needs of our clients.
