Principal Engineer, Operational Excellence & Resilience (Remote)

Remote Full-time
Note: This is a remote position open to candidates in the USA.

CrowdStrike is a global leader in cybersecurity, dedicated to stopping breaches and protecting modern organizations. The Technology Resilience Principal Engineer will lead the technology resilience function, driving strategy and execution of resilience practices across CrowdStrike's technology stack to ensure service reliability and rapid recovery capabilities.

Responsibilities
• Facilitate coordination between stakeholders across IT, Product, Engineering, and business units, serving as the central point for technology resilience initiatives and ensuring alignment with business objectives
• Own and maintain enterprise-wide technology resilience standards, ensuring consistent implementation and reducing organizational drift from established frameworks across infrastructure, application, and product domains
• Drive comprehensive technical resilience architecture, including infrastructure redundancy and fault tolerance, application resilience and graceful degradation strategies, and chaos engineering frameworks for continuous resilience validation
• Lead enterprise technical recovery strategy development and implementation, including backup and redundancy systems, recovery time/point objectives (RTO/RPO) for technical systems, and data recovery/restoration procedures
• Partner to define and implement resilience standards, including feature flagging, release, testing, multi-tenancy frameworks, and scalability frameworks to manage growth
• Provide technical oversight and aggregation of technology resilience risks across the enterprise, establishing and monitoring key performance indicators including system uptime
• Drive chaos engineering and resilience testing programs, establishing enterprise-wide practices for proactive resilience validation and continuous improvement
• Own shared resilience tooling strategy, evaluation, and implementation to support enterprise-wide capabilities including monitoring, testing, and recovery automation
• Build and maintain formal networks with key constituents across business units, engineering teams, and external partners
• Serve as senior technical advisor during major incident response, providing expertise on technical recovery strategies and coordinating cross-functional recovery efforts
• Drive innovation in resilience practices, identifying emerging technologies and methodologies to advance CrowdStrike's competitive resilience advantage
• Provide strategic guidance and expertise to junior team members and cross-functional partners on resilience engineering best practices

Skills
• 10+ years of direct experience in technology resilience, disaster recovery, site reliability engineering, or related technical disciplines, with demonstrated expertise in enterprise-scale cloud-native environments
• Deep understanding of infrastructure redundancy patterns, application resilience design, chaos engineering principles, and enterprise disaster recovery strategies across hybrid cloud architectures
• Proven experience with feature management systems, progressive deployment strategies, multi-tenant architecture resilience, and scalability engineering practices
• Proven ability to drive strategic initiatives across large technology organizations, with experience influencing senior stakeholders and leading complex, cross-functional resilience programs
• Experience establishing and monitoring resilience KPIs, including system uptime, MTTR, RTO/RPO objectives, and deployment success metrics
• Advanced certifications in disaster recovery, cloud architecture, or site reliability disciplines (e.g., DRCS, CISSP, AWS/Azure/GCP architecture certifications)
• Exceptional written and oral communication skills, including experience developing and delivering strategic briefings to executive leadership and technical teams
• Advanced analytical and conceptual thinking abilities, with a proven track record of solving complex, ambiguous resilience challenges with enterprise-wide impact
• Demonstrated ability to build formal networks and influence stakeholders across engineering, product, and business organizations
• Bachelor's degree in Computer Science, Information Systems, Engineering, Risk/Resilience, or equivalent practical experience
• Ability to provide leadership support during crisis events, including nights and weekends when required
• Experience leading technology resilience functions in high-growth, cloud-native technology companies
• Advanced knowledge of chaos engineering tools and practices (Chaos Monkey, Litmus, Gremlin, etc.)
• Experience with modern resilience patterns including circuit breakers, bulkheads, and progressive delivery
• Background spanning infrastructure operations, site reliability engineering, and product engineering
• Experience with observability and monitoring platforms supporting resilience objectives
• Advanced data analytics and visualization experience for resilience metrics and reporting
• Deep knowledge of compliance frameworks (ISO27001, ISO22301, SOC2, NIST, FedRAMP) and their intersection with technical resilience
• Experience scaling resilience programs and building high-performing resilience engineering teams

Benefits
• Remote-friendly and flexible work culture
• Market leader in compensation and equity awards
• Comprehensive physical and mental wellness programs
• Competitive vacation and holidays for recharge
• Paid parental and adoption leaves
• Professional development opportunities for all employees regardless of level or role
• Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
• Vibrant office culture with world-class amenities
• Great Place to Work Certified™ across the globe
• Health insurance
• 401k
• Paid time off

Company Overview
• CrowdStrike is a cybersecurity technology firm that provides cloud-delivered protection for cloud workloads, identity, and data. It was founded in 2011 and is headquartered in Sunnyvale, California, USA, with a workforce of 5001-10000 employees.

Company H1B Sponsorship
• CrowdStrike has a track record of offering H1B sponsorships, with 79 in 2025, 68 in 2024, 95 in 2023, 61 in 2022, 49 in 2021, and 22 in 2020. Please note that this does not guarantee sponsorship for this specific role.
