Resilience, Testability & Scalability Lead

• Posted 60+ days ago • Updated 6 days ago

Full Time

Job Details

Skills

Infrastructure
CI/CD
test automation
Monitoring
stream
scalability
Resilience
Testability
Observability

Summary

Job Title: Resilience, Testability & Scalability Lead
Location: Fort Mill, SC / New York / New Jersey / Remote (Hybrid)

Client: LPL Financial

Employment Type: Full-time

Project: Cloud-Native Enterprise Data Platforms, Engineering Quality & Resilience Track

Role Overview:
The ideal candidate will have a deep background in designing highly available systems, implementing robust disaster recovery, managing scalable cloud infrastructure, and building automated, testable, and observable platforms, especially within AWS and Kubernetes environments.

Key Responsibilities:
Design and implement high availability and failover strategies across multi-zone AWS deployments
Lead the development and execution of disaster recovery and business continuity plans, including RTO/RPO validation and cross-region strategies
Define testability strategies, test data management frameworks, and performance testing protocols
Enable infrastructure and application resilience by introducing circuit breakers, retry patterns, service meshes, and graceful degradation mechanisms
Establish real-time monitoring, alerting, and log aggregation frameworks using tools like CloudWatch and Prometheus
Drive test automation and quality engineering best practices, integrating with CI/CD pipelines
Optimize application and data layer performance through query tuning, caching, and indexing strategies
Scale data processing using distributed frameworks like Apache Spark, and implement event-driven stream processing with Kafka
Collaborate with platform, DevOps, and SRE teams to ensure resource efficiency, cost control, and performance SLAs
Contribute to regulatory readiness by enforcing security, encryption, and audit logging standards

Required Skills & Experience:
Infrastructure Resilience & DR:
Multi-AZ deployments, auto-scaling, load balancing, circuit breakers
Disaster recovery design: backup/restore, cross-region replication, RTO/RPO

Monitoring & Observability:
Experience with CloudWatch, Prometheus, log aggregators
Alerting on latency, throughput, and error rates to support incident response

Application Resilience & Security:
Error handling, service degradation, exponential backoff
Security best practices: IAM policies, encryption at rest and in transit
Familiarity with FINRA/SIPC compliance standards (preferred)

Test Automation & Quality:
Unit testing (e.g., pytest), integration testing, E2E automation
Test data generation, synthetic data, environment provisioning
Performance testing with JMeter or Gatling, including stress and capacity testing
Code reviews, static analysis, data validation, anomaly detection

Scalability & Optimization:
Horizontal scaling using Kubernetes, Docker, service discovery
API Gateway, caching layers (Redis, Memcached), DB partitioning
Connection pooling, capacity planning, cost-aware architecture

Data & Stream Processing:
Spark cluster management, parallel processing, big data optimization
Kafka-based messaging, windowing, and aggregation for real-time data

Preferred Qualifications:
Experience in financial services or regulated environments
Familiarity with LPL's enterprise data and platform modernization initiatives
AWS or Kubernetes certifications
Strong communication skills and cross-functional collaboration experience

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.

Dice Id: 10423087
Position Id: 2025-9779