Job Description
Salary: $89,700 - $162,150 per year
Requirements:
- Candidates must be based in the Atlanta, GA area for partial onsite responsibilities.
- Candidates must be U.S. citizens with the ability to obtain a Public Trust Clearance.
- A Bachelor’s degree in computer science or a related field is necessary, along with a minimum of 10 years of experience in System Administration.
- It is essential that candidates have extensive experience (7+ years) in designing and operating high-performance computing (HPC) infrastructure.
- Mastery of Linux systems and administration is expected, including troubleshooting, security, performance monitoring, and proficiency in various distributions (e.g., Red Hat, Ubuntu) to support scientific computing.
- Strong problem-solving and communication skills are vital for collaboration with clients, bioinformatics developers, researchers, and leading a team.
- Proficiency in working with network devices, including routers, switches, gateways, and hubs is required.
- Familiarity with developing infrastructure deliverables, continuous diagnostics, and security architecture support is expected.
- Proven leadership in planning and coordinating infrastructure support activities is necessary.
- Candidates should have demonstrated experience with HPC clusters, job schedulers (Slurm), and high-speed networking (10/40/100Gb).
- Proficiency in Bash and Python scripting for automation is essential, along with experience in cloud technologies and container environments (e.g., Docker, Singularity, Kubernetes).
Responsibilities:
- Manage high-performance computing infrastructure by deploying, administering, and monitoring HPC clusters, and handle multi-petabyte data using Pure Storage flash and AWS S3 Glacier.
- Oversee the installation, maintenance, and upgrading of scientific software, libraries, and batch schedulers such as GridEngine and Slurm while developing effective processes for resource sharing among research teams.
- Manage the VMware vSphere Foundation for virtual server provisioning, deployment, and configuration, and oversee hardware and software implementation and maintenance.
- Take charge of system operations, including monitoring, routine and ad hoc security patch management, troubleshooting, and performance tuning.
- Advise customers and project managers in designing and documenting technical solutions, supporting infrastructure projects from planning to execution with status updates.
- Lead automation efforts to streamline system management tasks using scripting languages and configuration management tools.
- Collaborate closely with scientists, bioinformatics developers, and principal investigators to understand their computational requirements and optimize their workflows.
- Lead the technical design, integration, and optimization of both on-site HPC and cloud resources.
- Mentor other system administrators on best practices for system administration and troubleshooting, and manage the team of administrators.
- Implement robust security measures and design architectures that meet compliance standards such as HIPAA or NIST.
- Design and implement backup and disaster recovery plans, and integrate monitoring and alerting systems for system availability and reliability.
Technologies:
- AWS
- Architect
- Azure
- Bash
- Cloud
- Docker
- Hardware
- Support
- Kubernetes
- LAN
- Linux
- Network
- Python
- Security
- Ubuntu
- VPN
- VMware
- Ansible
- Project Manager
- Puppet
More:
- A Master’s degree in IT, engineering, or a related field is desired. Experience working with a federal government agency or research organization, as well as large-scale infrastructure design, would be advantageous.
- Certifications such as Red Hat Certified Engineer (RHCE), Red Hat Certified Architect (RHCA), or equivalent are preferred.
- Familiarity with computer networking protocols (e.g., TCP, IP, UDP, DHCP, DNS) and an understanding of network design (LAN, WAN, VPN) are expected.
- Experience in optimizing cloud utilization patterns and supporting development, validation, operations, and security in a hybrid model is valued.
- Certifications in AWS or Azure Cloud engineering are appreciated.
If you're seeking comfort, you may want to look elsewhere. At Leidos, we think outside the box, build innovatively, and advance beyond the norm—because the mission demands it. We're not looking for followers. We want the individuals who challenge the status quo, provoke thought, and refuse to accept failure. We are already moving forward faster than anyone else, leaving the past behind.
Last updated: week 43 of 2025
Job Tags
Full time