Job Description:
• Interface with stakeholders to capture requirements, plan the execution of technical requirements, and provide verbal and written briefs to government stakeholders.
• Interface with IA/J6 to provide necessary documentation and artifacts through the RMF process leading to full ATO.
• Assist with deployment and mission command system architecture on AWS & Azure IL5, IL6, and IL7 environments.
• Design and implement automation tools and frameworks for continuous integration, delivery, and deployment.
• Monitor and manage infrastructure, ensuring optimal performance, security, and scalability.
• Assist in architecting a universally available, on-prem Kubernetes (RKE2) deployment model for mission-critical customers.
• Assist in streamlining deployment automation using Ansible and other tools.
• Assist with designing, configuring, and maintaining infrastructure and/or cloud resources necessary to reliably host, run, and support applications through all phases of the software development life cycle (SDLC).
• Create and maintain continuous integration and deployment (CI/CD) pipelines.
• Write scripts and configurations to automate software testing, deployment, and maintenance in a manner consistent with software engineering best practices.
• Proactively monitor build and deploy pipelines and infrastructure, apply patches, troubleshoot issues, and resolve errors.
• Automate infrastructure and cloud resource deployments and configuration using infrastructure-as-code tools.
• Work with engineering teams to create CI/CD pipelines for their applications and ensure code is properly integrated into the CI/CD pipelines.
• Configure and support development environments for consistent development processes across devices.
• Participate in the design of the version control system and configure it to align with the software development life cycle (SDLC), security policies, and best practices.
• Automate security testing and vulnerability scanning and integrate it into the development process.
• Remediate findings from vulnerability scanning and penetration testing.
• Assist with training users on both new and existing functionality.
• Informally mentor other staff in DevOps concepts and processes.
• Collaborate with peers and business partners to identify workflows or processes where automation can improve efficiency and/or reduce costs.
• Leverage multivendor APIs and write scripts to create solutions to improve existing processes.
• Engage internal departments and third-party vendors for security and infrastructure support and best practices.
• Troubleshoot technical issues, identify the cause, determine resolutions, and remediate issues in existing applications.
• Deploy and maintain MinIO in distributed mode. This enables data to be stored across multiple nodes, ensuring high availability and fault tolerance.
• Implement multiple control-plane nodes with a load balancer in front to distribute the traffic and ensure high availability.
• Ensure a multi-node setup for GitOps tools and store configurations redundantly across nodes.
• Perform regular backups of configurations.
• Use Patroni / Spilo / Scalyr to automate failover to a replica if the primary node fails.
• Implement regular backups and real-time replication.
• Ensure uptime, manage failovers, and scale as needed.
• Update container baselines and apply system patches.
• Maintain the overall functionality of the HA cluster.
• Work closely with the DevSecOps team to coordinate the deployment of updates and other security patches.
Qualifications:
• Ability to interface with stakeholders to capture requirements, plan the execution of technical requirements, and provide verbal and written briefs to government stakeholders.
• Ability to interface with IA/J6 to provide necessary documentation and artifacts through the RMF process.
• Minimum of 7 years as a DevOps engineer supporting DoD projects.
• Must be a US citizen with an active DoD security clearance.
• Direct experience architecting, deploying, and maintaining IL6 and IL7 cloud infrastructure.
• Linux system administration experience
• EKS, AKS deployment experience
• CKA, CKAD, or other relevant Kubernetes certifications or experience
• Experience architecting, deploying, and maintaining high availability Kubernetes clusters in an air-gapped environment.
• Familiar with DoD system security requirements and processes such as RMF, DISA STIG, GRC guidelines for cloud services, container hardening, security patching, etc.
• Strong technical communication skills.
• Ability to generate and brief architecture design and network diagrams.
• Maintain a minimum of Security+ certification.
• Networking experience is a plus.
• Motivated individual willing to work in a small team environment and assume different responsibilities when required.
• Problem-solving skills with a strong communication skill set.
• Familiarity with AWS/Azure cloud automation and deployment tools and services such as Terraform, GitOps, S3, EKS, AKS, RDS, IAM, CloudWatch, etc.
• AWS & Azure certification
• Experience architecting high availability core services such as HA PostgreSQL, MinIO, Ingress NGINX, Keycloak, etc.