[Remote] Junior Site Reliability Engineer | Remote US

Remote Full-time
Note: This is a remote position open to candidates in the USA.

Coalfire is on a mission to make the world a safer place by solving clients’ toughest cybersecurity challenges. As a Junior Site Reliability Engineer, you will support the Managed Services team by ensuring the reliability and scalability of cloud-hosted infrastructure for major clients, utilizing automation and technical skills across various cloud platforms.

Responsibilities
• Become a member of a highly collaborative engineering team offering a unique blend of Cloud Infrastructure Administration, Site Reliability Engineering, Security Operations, and Vulnerability Management across multiple clients
• Coordinate with client product teams, engineering team members, and other stakeholders to monitor and maintain a secure and resilient cloud-hosted infrastructure to established SLAs in both production and non-production environments
• Innovate and implement using automated orchestration and configuration management techniques. Understand the design, deployment, and management of secure and compliant enterprise servers, network infrastructure, boundary protection, and cloud architectures using Infrastructure-as-Code
• Create, maintain, and peer review automated orchestration and configuration management codebases, as well as Infrastructure-as-Code codebases. Maintain IaC tooling and versioning within Client environments
• Implement and upgrade client environments with CI/CD infrastructure code and provide internal feedback to development teams on environment requirements and necessary alterations
• Work across AWS, Azure, and GCP, understanding and utilizing their unique native services in client environments
• Configure, tune, and troubleshoot cloud-based tools, and manage cost, security, and compliance for the Client’s environments
• Monitor and resolve site stability and performance issues related to functionality and availability
• Work closely with client DevOps and product teams to provide 24x7x365 support to environments through Client ticketing systems
• Support definition, testing, and validation of incident response and disaster recovery documentation and exercises
• Participate in on-call rotations as needed to support Client critical events and operational needs that may fall outside of business hours
• Support testing and data reviews to collect and report on the effectiveness of current security and operational measures, and remediate deviations from those measures
• Maintain detailed diagrams representative of the Client’s cloud architecture
• Maintain, optimize, and peer review standard operating procedures, operational runbooks, technical documents, and troubleshooting guidelines

Skills
• BS or above in a related Information Technology field, or an equivalent combination of education and experience
• 2+ years of experience in 24x7x365 production operations
• Fundamental understanding of networking and network troubleshooting
• 2+ years of experience installing, managing, and troubleshooting Linux and/or Windows Server operating systems in a production environment
• 2+ years of experience supporting cloud operations and automation in AWS, Azure, or GCP (and aligned certifications)
• 2+ years of experience with Infrastructure-as-Code and orchestration/automation tools such as Terraform and Ansible
• Experience with IaaS platform capabilities and services (cloud certifications expected)
• Experience with ticketing tools such as Jira and ServiceNow
• Experience using environmental analytics tools such as Splunk and Elastic Stack for querying, monitoring, and alerting
• Experience in at least one primary scripting language (Bash, Python, PowerShell)
• Excellent communication, organizational, and problem-solving skills in a dynamic environment
• Effective documentation skills, including technical diagrams and written descriptions
• Ability to work as part of a team with a professional attitude and demeanor
• Previous experience in a consulting role within dynamic and fast-paced environments
• Previous experience supporting a 24x7x365 highly available environment for a SaaS vendor
• Experience supporting security and/or infrastructure incident handling and investigation, and/or system scenario re-creation
• Experience working with container and container orchestration solutions such as Kubernetes, Docker, EKS, and/or ECS
• Experience working within an automated CI/CD pipeline for release development, testing, remediation, and deployment
• Cloud-based networking experience (Palo Alto, Cisco ASAv, etc.)
• Familiarity with frameworks such as FedRAMP, FISMA, SOC, ISO, HIPAA, HITRUST, PCI, etc.
• Familiarity with configuration baseline standards such as CIS Benchmarks and DISA STIG
• Knowledge of encryption technologies (SSL, encryption, PKI)
• Experience with diagramming tools (Visio, Lucidchart, etc.)
• Application development experience for cloud-based systems

Benefits
• Paid parental leave
• Flexible time off
• Certification and training reimbursement
• Digital mental health and wellbeing support membership
• Comprehensive insurance options

Company Overview
• Coalfire is the premier Cybersecurity and Compliance Services leader for the tech, healthcare, and finance industries. It was founded in 2001 and is headquartered in Chicago, Illinois, US, with a workforce of 1001-5000 employees.

Company H1B Sponsorship
• Coalfire has a track record of offering H1B sponsorships, with 2 in 2025, 4 in 2024, 3 in 2023, 6 in 2022, 2 in 2021, and 4 in 2020. Please note that this does not guarantee sponsorship for this specific role.