Tight timeframe to submit. If interested, reach out directly to [email protected] or call/text 301-252-8762.

Please read the following skill and qualification requirements for this position before applying.

REQUIRED: US CITIZEN OR PERMANENT GREEN CARD HOLDER, due to the clearance you will obtain per government requirement

Hybrid
This position is primarily located at NIAID’s Rocky Mountain Laboratories campus in Hamilton, MT. You must reside within commuting distance to this client site and be able to be onsite as required to meet contractual obligations and project needs.

In your role as a Senior HPC Architect, you will be a subject matter expert architecting, implementing, and managing multiple high-performance compute clusters and their associated infrastructure for a large biomedical research community.

WHAT YOU’LL NEED TO SUCCEED:
Education: BS/BA (or equivalent)
Minimum of 10 years of related experience
Minimum of 5 years’ experience as an engineer or architect with HPC technologies
Hands-on architecture design experience with HPC to include storage, file system, InfiniBand, security, authentication, and compute architectures
Experience with Slurm job scheduling, including troubleshooting job status and optimizing submission scripts
Experience using Git to manage shared software configuration code bases
Hands-on experience with cloud-based services (e.g. Azure, AWS, GCP)
Minimum of five years’ experience in Linux systems administration
Good understanding of storage administration and optimization, such as performing upgrades and defining RAID configurations
Good understanding of fundamental networking concepts and their practical applications
Experience with the Spack or EasyBuild package managers, including making packages from PyPI, R, and GitHub
Knowledge and experience in one or more scripting languages applicable to Linux (e.g. Bash, Perl, Python)

Preferred Skills:
Experience administering Red Hat / CentOS-based systems
Experience working in a life-sciences oriented environment
Experience configuring and using monitoring systems to monitor HPC clusters
Ability to determine meaningful metrics and usage data for monthly status reports and health dashboards
Experience with DevOps or DevSecOps methodologies, such as automation and configuration management
Strong troubleshooting skills